
SEQUENCE LISTING 

SEQ ID NO 1: Clostridium pasteurianum hydrogenase; Genbank accession number M81737 gi:144835

MKTIimGVQFNTDEDTTILKFARDNNIDISALCFLNNCNNDINKCEICTVEVEGTGLV
TACDTLIEDGMIINTNSDAVNEKIKSRISQLLDIHEFKCGPCNRRENCEFLKLVIKYKA 
RASKPFLPKDKTEWDERSKSLTVDRTKCLLCGRCVNACGKNTETYAMKFLNKNGK 
T1IGAEDEKCFDDTNCLLCGQCIIACPVAA15EKSHN1DRVKNALNAPEKHVIVAMAP 
SVRASIGELFNMGFGVDVTGKTyTALRQLGFDKIFDINFGADMTIMEEATELVQRIEN 

NGPFPMFTSCCPGWVRQAENYYPELLNNLSSAKSPQQIFGTASKTYYPSISGLDPKNV
FTVTVNffCTSKKFEADRPQMEKDGLRDIDAVITTRELAKMIKDAKIPFAKLEDSEAD 
PAMGEYSGAGAIFGATGGVMEAALRSAKDFAENAELEDIEYKQVRGLNGIKEAEVEI 
NNNKYlsrVAVmGASNLFKFMKSGMINEKQYHFIEVMACHGGCVNGGGQPHVW 
LEKVDIKKVRASVLYNQDEHLSKRKSHENTALVKMYQNYFGKPGEGRAHEILHFKY 

KK

SEQ ID NO 2: Desulfovibrio vulgaris hydrogenase; Genbank accession number GI:40830 
X02416 

MSRTVMERIEYEMHTPDPKADPDKLHFVQIDEAKCIGCDTCSQYCPTAAIFGEMGEP
HSIPfflEACINCGQCLTHCPENAIYEAQSWVPEVEKKLKDGKYKCIAMPAPAVRYAL 
GDAFGMPVGSVTTGKMLAALQKLGFAHCWDTEFTADVTIWEEGSEFVERLTKKSD 
MPLPQFTSCCPGWQKYAETYYPELLPHFSTCKSPIGMNGALAKTYGAERMKYDPKQ 
VYTVSIMPCIAKKYEGLRPELKSSGMRDIDATLTTRELAYMIKKAGIDFAKLPDGK^ 

DSLMGESTGGATIFGVTGGVMEAALRFAYEAVTGKKPDSWDFKAVRGLDGIKEATV
NVGGTDVKVAWHGAKRFKQVCDDVKAGKSPYHFIEYMACPGGCVCGGGQPVMP 
GVLEAMDRTTTRLYAGLKKRLAMASANKA 

SEQ ID NO 3: Entamoeba histolytica hydrogenase; Genbank accession number GI:9963974 AF248542

MPPKPSHTLTGHDHNfHSIQFDWSKCMGCGMCATKCTFGVLVKQPPKIPPFVQPNRE 

KLSQEhTTOKTRVLroESECTGCGQCSLVCOTGSITPIDHLVDTFKAKEAGKKLVAMIA 

PSTRLGVAEAMGMPIGSTAMAQLVHCLmGIT)Y\lT)VDAGADKTTMDDYAEVIE 

MKKEGKGPATTSCCPAWIELVEKEYPDLIPNVSTARSPIGCLAGCIKRG 

WAKDVGIAVEDLYTVGIMPCIAKKTESQRQQMQDYDASCTSNEIAAYFKKHLPPEE 

CKFTQEREEALAKTEDGQCDLPFRRISGGSNIFGKTGGVCETVLRVIARNAGVDWNS 

CTVNKEETFKHAASGSTMTNLSVDIGGTIITGAVCHGGYAIRHACELIRKGELKVDV 

VEMMACVGGCLGGAGQPKIPPAKKLE^IDKRRVMLDILDQQTDIRAANE^fTDVLGW 

IDKHFDHQGAHQHLHTYFTPRYQN 



SEQ ID NO 4: NC_001146. Saccharomyces cerevisiae GI:6324089

MSALLSESDUSfDnSPALACVKPTQVSGGKKDNAnSIMNGEYEVSTEPIXJLEKVSITLS 

DCLACSGCrrSSEEILLSSQSHSVFLKNWGKLSQQQDKFLWSVSPQCRLSLAQYYGL 

TLEAADLCLMNFFQKHFQCKYMVGTEMGRnSISKTVEKIIAHKKQKENTGADRKPL 

LSAVCPGFLIYTEKTKPQLVPMLLNVKSPQQITGSLIRATFESLAIARESFYHLSLMPCF 

DKKLEASRPESLDDGIDCVITPREIVTMLQELNLDFKSFLTEDTSLYGRLSPPGWDPR 

VHWASNLGGTCGGYAYQYVTAVQRLHPGSQMIVLEGRNSDIVEYRLLHDDRIIAAA 

SELSGFRNIQNLVRKLTSGSGSERKRNITALRKRRTGPKANSREMAAATAATADPYH 

SDYIEVNACPGACMNGGGLLNGEQNSLKRKQLVQTLNKRHGEELAMVDPLTLGPK 

LEEAAARPLSLEYVFAPVKQAVEKDLVSVGSTW 



SEQ ID NO 5: Chlorella fusca GI:21732235 AJ298227

MCCPWASRHAGRARHVAVRAAGPTSECDCPPTPQAKLPHWQQALDELAKPKESR 

RLMIAQIASAVRVAIAETIGLAPGDVTIGQLVTGLRMLGFDYVFDTLFGADLTIMEEG 

TELLHRLQDHLEQHPNKEEPLPMFTSCCPGWVAMVEKSNPELIPYLSSCKSPQMMLG 

AVIKNYYAQQVGVQPSDICNVSVMPCVRKQGEADREWFNTTGAGLARDVDHVVTT 

AEVGKIFLERGIKLNELPESNFDNPIGEGTGGALLFGTTGGVMEAALRTVYEVVTQKP 
MGRVDFEEVRGLEGKEAErrLKPGDDSPFKAFAGADGQGITLKIAVANGLGNAKKL

IKSLSEGKJ^YDFffiVMACPGGCIGGGGQPRSTDKQILQKRQQAMYNLDERSTIRRS 

HDNPFIQALYDKFLGAPNSHKAHDLLHTHYVAGGIPEEK 

SEQ ID NO 6: Clostridium saccharobutylicum GI:488597 U09760

MIMVIDEKTIQVQENTTVIQAALANGIDIPSLCYLNECGNVGKCGVCAVEIEGKNNL 
ALACITKVEEGMVVKTNSEKVQERVKMRVATLLDKHEFKCGPCPRRENCEFLKLVI 
KTKAKANKPFVVEDKSQYIDIRSKSIVIDRTKCVLCGRCEAACKTKTGTGAISICKSES 

GRTVQATGGKCFDDTNCLLCGQCVAACPVGALTEKTHVDRVKEAL

EDPNKHVIVAMAPSIRTSMGELFKLGYGVDVTGKLYASMRALGFDKVFDINFGADM 
TIMEEATEFffiRVKhnSIGPFPMFTSCCPAWVRQVENYYPEFLENLSSAKSPQQIFGAAS 
KTYYPQISGISAKDVFTVTIMPCTAKKFEADREENrraEGIKNroAVLTTRELAKMI^ 
AKINFANLEDEQADPAMGEYTGAGVIFGATGGVMEAALRTAKDFVEDKDLTDIEYT 

QmGLQGIKEATVEIGGENYNVAVmGAANLAEFMNSGKILEKNYHFIEVMACPGGC
WGGGQPHVSAKEREK\T)VRTVRASVLYNQDKmEiaiKSHKm'ALLNMYYDYMG 
APGQGKAHELLHLKYNK 



SEQ ID NO 7: GI:145099 M27212. Desulfovibrio vulgaris [Fe]-hydrogenase alpha subunit

MSRIEMEKIFYEDHAPDPKADPDKLFFIQIDESKCIGCDSCQQYCPTGAIFGDTGDAH 
KIPHEELCINCGQCLTHCPVGAryTSQSWVTEIEKKIKAKDVKVIAMPAPAVRYALGD 
AFGLPVGTVTTGKMFSALKELGFDHCWDNEFTADVTTWEEGTEFVQRLTKKLDKPL 
PQFTSCCPGWHKYVESLYPELFPHMSSCKSPIGMLGTLAKTYGADRMKYDRAKVYT
VSIMPCTAKKYEGNDlPQLWDSGHKDroATIDTRELAYMIKKAKIDFTKLPDGKRDTL 
MGESTGGATLFGWGGVMEAAIJlYAYQAVTGKOESNnDFKGWGLQGVKEATVN 
VGGVDVKVAWHGARRFHDVCELVKAGKAPWHFIEFMACPGGCVCGGGQPVMPG 
VLEAADRRSTRMYAGLKKRLAMASASRA 

SEQ ID NO 8: GI:145100 M27212 Desulfovibrio vulgaris [Fe]-hydrogenase beta subunit
MQIVNLTRRGFLKAACVVTGGALISimTGKAVAAAKQLKDYMMDRINGVYGADA

KFPVRASQDNVQVQKLYADFLEKJ>MSHKAEQLLHTHWVDRSKAffiRMKAQGAW 

PRAKEFEGNTYPYE 

SEQ ID NO 9: Desulfovibrio vulgaris (Hildenborough) hydrogenase [gi:40827] X57838

MNAFINGKEVRCEPGRTILEAARENGHFIPTLCELADIGHAPGTCRVCLVEIWRDKEA 

GPQIWSCTTP\^EGMRIFTRTPEVRRMQRLQVELLLADHDHDCAACARHGDCELQ 

DVAQFVGLTGTRHHFPDYARSRTRDVSSPSWRDMGKCIRCLRCVAVCRNVQGVD 

ALWTGNGIGTEIGLRHNRSQSASDCVGCGQCTLVCPVGALAGRDDVERVIDYLYDP
EIVTVFQFAPAVRVGLGEEFGLPPGSSVEGQVPTALRLLGADVVLDTNFAADLVIME 
EGTEU.QRLRGGAKLPLFTSCCPGWVNFAEKHLPDILPHVSrrRSPQQCLGALAKTY 
LARTMNVAPERMRWSLNIPCTAKKEEAARPEFRRDGVRDVDAVLTTREFARLLRRE 
GIDLAGLEPSPCDDPLMGRATGAAVIFGTTGGVMEAALRTVYHVLNGKELAPVELH 

ALRGYENVREAWPLGEGNGSVKVAWHGLKAARQMVEAVLAGKADHVFVEVM
ACPGGCMDGGGQPRSKRAYNPNAQARRAALFSLDAENALRQSHNNPLIGKVYESFL 
GEPCSNLSHRLLHTRYGDRKSEVAYTMRDIWHEMTLGRRVRGDSD 



SEQ ID NO 10: Clostridium perfringens hydrogenase [gi:4239871] AB016775

MNKIIINDKTIEFDGDKTILDLARENGFDIPVLCELKNCGNKGQCGVCLVEQEGNDRL 
LRSCAIKAKDGMVIKTDSEKVLEARKERVAELLDEHEFKCGPCKRRENCEFLKLVIK 
TKARAHKPFWADKSEYVDDRSKSIVLDRSKCVKCGRCVAACRTRTATNSIKFHRID 

GVRLVGPEELKCFDDTNCLLCGQCIAACPVDALSEKSHIERVQEALNDPEKHVIVAM
APAVRTSMGELFKMGYGQDVTGKLYTALRELGFDKVFDINFGADMTIMEEATELIE 
RIK]SINGPFPMLTSCCPS\VA^VENYFPELVENLSSAKSPQQIFGAASKTYYPQVADID 
PKKVFTVTVMPCTSKKFEADRPEMENEGIIOSiroAVlTTRELARMIKA^^ 
EVDPAMGEYTGAGVIFGATGGVMEAALRTAKDFMENDNLDNVDYEAVRGLAGIKE 

AEVEIAGNEYKLAVVSGAANVFELVKSGKINDYHFIEVMACPGGCVNGGGQPHISAE
DSDKMDIREVRASVLYNQDKNLEKRKSHQNSALLKMYESYMGKPCaiGRAHELLH 
MKYKK 






Attorney Docket Number: H2042101-CIP 
Inventor Harrison F. Dillon 



SEQ ID NO 11: Clostridium perfringens GI:4239897 AB016820

MNKHINDKTIEFDCaDKmDLARENGFDIPVLCELKNCa^CGQCGVCLVEQEaSIDRL 
UISCAKAKDGMVIKTDSEKVLEARKERVAEIXDEHEFKCGPCKRRENCEFLKLV^ 
TKARAHKPFWADKSEYVDDRSKSIVLDRSKCVKCGRCVAACRTRTATNSIKFHRID
GN^VGPEELKCFDDTNCLLCGQCIAACPVDALSEKSHIERVQDALNDPEKHVIVAM 
APAVRTSMGELFKMGYGQDVTGKLYTALRELGFDKVFDINFGADMTIMEEATELIE 
RIKNNGPFPMLTSCCPSWVREVENYFPELVENLSSAKSPQQIFGAASKTYYPQVADID 
PKKWTVTVMPCTSKKFEADRPEMEI^GIRMDAVITITIELARMIKAAKIDFAKLEDG 
EVDPAMGEYTGAGVIFGATGGVMEAALRTAKDFMENDNLDNVDYEAVRGLAGIKE
AEVEIAGNEYKLAVVSGAANVFELVKSGKINDYHFIEVMACPGGCVNGGGQPHISAE 
DSDKTOIREVRASVLYNQDKNLEKRKSHQNSALLKMYENYMGKPGHGRAHELLHM 
KYKK 

SEQ ID NO 12: Megasphaera elsdenii GI:6650985 AF120457

MFEFHSRFEKTORRVProEHNCAVQFDVrKCKNCTLCRRACADTQTVLDYYSLSSTG 
DMPlCVHCGQCSSACPFGArVEVNDVDKVKAALKDPEKTVIFQTAPAVRVGLGEAFG 

MDPGTFVEGKMVAALRTLGADYVFDTDFGADLTIMEEATELLHRLQSEEIPIPQFTSC
CPAWVEFAETFYPDLLQHLSSTKSPISILSPVIKTYFAQQKMDPKKIVNVCVTPCTAK 
KAEIRRPELSASGLFWDEPEBRDTDICITTRELAQWIQDENroFASLEDSKFDKAFGEA 
SGGGWFGNSGGVMEAAniTAYHMFTGRPAPKI)FIPFEPVRGLQGVKKATVIFGHFV 
LHVAAISGLGNARAFroDLIK>fDAFEDYSFffiVMACPGGCIGGGGQPKVKLPQVKKV 

QEARTASIYKSDEETDIKASWQNPEIETLYEAFUJEPLSEMAEFrLHrYFSDKSDQLG
RMKNLTPQTNPMSPKYKPPTEE 



SEQ ID NO 13: Desulfovibrio desulfuricans strain G20 [Fe] hydrogenase GI:13022069 AF331719

MNLVEMEKIQYVDQSPDPRANPDELFFIQIDPEKCIGCDTCQEYCPTGAIFGDTGSAH 
SIPHEEICINCGQCLTHCPVGAIYEVQSWVRELSEKIKDPEIKVIAMPAPAVRYGLGEC 
FGMPVGTVTTGKMLTALQMLGFDHVWDNEFTADVTIWEEGTEFVNIU.TGQIDKPLP 
QFTSCCPGWHKy\^SF«>EU?PHLSSCKSPIGMMGALAKWGPDVMKYDRSKVYTV
SIMPCTAKKYEGMRADLWSSGYKDroATIDTRELAYMIKKAGIDFAALPDGKRDTL 
MGDSTGGATIFGVSGGVMEAALRYAYEAWGKKPSSWDFTMW(a.NGIKEGTVTIG 
DAKINVAWHGAKRFAEVCEVIKTGKSPWHFffiFMACPGGCVCGGGQPVMPGVLEA 
MDRKVSRTFAGLKERLNRMSSSKA



SEQ ID NO 14: U07229. Desulfovibrio fructosovorans Iron hydrogenase GI:466366

MSMLTrrroGKTTSWEGSmDAAKTLDroiPTLCYLNLEALSINNKAASCRVCVVEV 
EGRWSILAPSCATPVTDNMVVKTNSLRVLNARRTVLELLLSDHPKDCLVCAKSGECE 
LQTLAERFGIl^SPYDGGEMSHYRKDISASIIRDMDKCIMCRRCEmCNTVQTCGVLS 
GVNRGFTAVVAPAFEMNLADTVCTNCGQCVAVCPTGALVEHEYIWEVVEALANPD 

KWIVQTAPAVRAALGEDLGVAPGTSVTGKMAAALRRLCTDHVFDTDFAADLTIME
EGSEFU)RL(m^LAGDT^ma.PILTSCCPGWVKFFEHQFPDMLDVPSTAKSPQQMFG 
AIAKTYYADLLGIPREia.VVVSVMPCLAKKYECARPEFSVNGNPDVDIVriTRELAKL 
VKRMNIDFAGLPDEDFDAPLGASTGAAPIFGVrGGVIEAALRTAYELATGETLKKVD 
FEDVRGMDGVKKAKVKVGDNELVIGVAHGLGNARELLKPCGAGETFHAIEVMACP 

GGCIGGGGQPYHHGDVELLKKRTQVLYAEDAGKPLRKSHENPYIIELYEKFLGKPLS
ERSHQLLHTHYFKRQRL 



SEQ ID NO 15: Y11759 Desulfovibrio fructosovorans Iron hydrogenase GI:1914864

MSRffiMAKIFYEQTVPPPGTNLDQAYIVQVDETKCIGCDTCMGYCPTGAITGESGEPH 
KVVDPAACINCGQCLTHCPVAAIYETVSFVPEIEAKLKDKNVKVIAMPAPAVRYALG 
DPFGNfPLGAVTTEHMLTGLKQLGFDNVWDNEFIADVTIWEEGSELLARITKKLDKP
LPQFTSCCPGWQKYAETFYPELLPHFSSCKSPIGMMGPLAKTYGAKELGYEPKQIYT 
VSIMPCTAKKFEGMRPEMDASGFRDIDATmTRELAYMMKKAGroLPKIANGIOaJA 
VMGESTGGATIFGVSGGVMEAALRFAYQALTKKPPQSWDFKAVRGLNGIKEATINIG 
gtdvkvavvnckhcnfakvcdevkagkspyhfiefmacp<kk:vmgggqpim^^
smnrtttkfyaslkkrlalydaqka 



SEQ ID NO 16: Thermotoga maritima hydrogenase [gi:4980694] AE001705

^mRFFK^^^LRNLSQNGETNSVRRCFALADVTWINGRTLTWDNLTVIEACEKAGIEI 

PALCHHPRLGESIGACRVCVVEVEGARNLQPACVTKVRDGMVIKTSSDRVKTARKF 

NLALLLSEHPNDCMTCEANGRCEFQDLIYKYDVEPIFGYGTKEGLVDRSSPAIVRDLS 

KCIKCQRCVRACSELQGMHIYSMVERGHRTYPGTPFDMPVYETDCIGCGQCAAFCPT
GATVENSAVKWLEELEKKEKILWQTAPSVRVAIGEEFGYAPGTISTGQlvrVAALRR 
LGFDYWinNFGADLTIMEEGSEFLERLEK<H)LEDLPMFTSCCPGWVNLVEKVYPEL 
RTTU^SAKSPQGMLSAMVKTYFAEKLGVKPEDIFHVSIMPCTAKKDEALRKQLMVN 
GWAVDWLTTRELGKLIRMKKffFANLPEEEYDAPLGISTGAAALFGVTGGVMEAA 

LRTAYELKTGKALPKIVFEEVRGLKGVREAEroLIXjKKIRIAVVHGTANVRNLVEKIL
RREVKYHFVEVMACPGGCIGGGGQPYSRDPEILRKRAEAIYTroERMTLRKSHENPAI 
KKLYEEYLEHPLSHKAHELLHTYYEDRSRKKRLAVK 



SEQ ID NO 17: AE001794. Thermotoga maritima alpha subunit hydrogenase GI:4981990

\IKma)GREVIINDNERNLLEALKNVGIEIPNLCYLSEASIYGACRMCLVEINGQITTS 
CTLKPYEGMKVKTbTITErV^MRRNILELILATHNRDCTTCDRNGSCKLQKYAEDFGI 

RKIRFEALKKEHVRDESAPVVRDTSKCILCGDCVRVCEEIQGVGVIEFAKRGFESVVT
TAFDTPLffiTECVLCGQCVAYCPTGAI^IRNDIDKLIEALESDKIVIGMIAPAVRAAIQE 
EFGroEDVAMAEKLVSFLKTIGFDKVFDVSFGADLVAYEEAHEFYERLKKGERLPQF 
TSCCPAWVKHAEHTYPQYLQNLSSVKSPQQALGTVIKKIYARKLGVPEEKIFLVSFM 
PCTAKKFEAEREEHEGIVDIVLTTRELAQLIKMSRIDINRVEPQPFDRPYGVSSQAGLG 

FGKAGGWSCVLSVLNEEIGffiKVDVKSPEIXjIRVAEVrLKDGTSFKGAVIYGLGKV
KKFlJEERKDVEIffiVMACNYGCVGGGGQPYPNDSRIREHRAKVLRDTMGIKSLLl^ 
ENLFLMKLYEEDLKDEHTRHEILHTTYRPRRRYPEKDVEILPVPNGEKRTVKVCLGT 
SCYTKGSYEILKKLVDYVKENDMEGKIEVLGTFCVENCGASPNVIVDDKIIGGATFEK 
VLEELSKNG 




SEQ ID NO 18: Y16775. gi:4034790 Nyctotherus ovalis

MISRLIAKKAPLFLRTFATSEMISLKroGKJISWKGIMLADAIKKAGA>rVPTMCYHPD 
LPTSGGICRVCLVESAKSPGYPIISCRTPVEEGMEIVTQGSKMKEYRQANLALMLSRH 
PNACLSCTSKmCKTQELSANMNIGQCGFANATPPKNDDSYDMTTAIERDNDKCINC 
DICVHTCSLQGLNALGFYNEEGHAVKSMGTLDVSECIQCGQCINRCPTGAITEKSEIR 

PVLDAIMQQRLWQMAPSIRVAVAEEFGIKPGEKILKNEIATALRKLGSNVFVLDTNF
SADLTHEEGHELIERLYRNVrGKKLLGGDHMPlDLPMLTSCC 
PGWIMFffiKlSIYPDLLNNLSTCKSPQGMLGALIKGWAKNIKKMDPKDIVSVSIM^ 
AKKAEKERPQLRGDEGYKDVDYILTTRELAKMLKQSNIDLAKMEPTPFDKVMSEGT 
GAAWGWGG\^AALRTANEVITGREWFKNLNIEAVRGMEGffiEAGIKLENVLD 

ICi^KAFEGVTVKVAIAHGPNNARKVMDHKQAKESCaCPAPWHFVEVMACPGGCIGG
GGQPKPTM.EmQARTQLTFKEDMDLPLRKSHDNPEIKAIYENYLKEPLGHNSHHYL 
HITYSSQKVRDMNLYNANEAAGLDEILAKYPKEKEYLMPinEEHDKKGYISDPSIVK 
ISEHLGMYPAQIESILSSYHYFPREHTIAILMSICVHCHNCMMKGQGRLLKTIQETYDI 
HETHGGVAKDGSFTLHTLNWLGYCVNDAPAMMIKRKGTNYVETFTGLLGDNIDQR 

LKSLKNLKKELPKWPKNNIREMKSQRNGNSYSCMNTQAPIAEATKKAVSMGPEKVI
EEVFKSNLVGRGGAGFRTGKKWESAYKTPASDKYWCNADEGLPSTYKDWCLLNN 
EAKRKEVFTGMGICAKTIGAKRCFMYLRYEYRNLVPALEQSIKDVQSTCPELADLKY 
EIRLGGGPYVAGEENAQFESreGRAPLPRKDRPGNIFPTMEGLFHKPTVINNVETFFAI 
PHHQQGSQSFGEGKMPKLLSVTGDVDEPILffiTI^NWSLNHLLQEISAKDrVAAEIG 

GCTEPUFGSKFDTLFGFGRGTLNAVGSWLFNSSCDLGKIYENKLKFMAEESCKQCV
PCRDGSYIFHRAFKELRDTGKSSYNMRALAVASESAARSSICAHGKALESLFKSACDF 
MNKTKPIYQPHSTYHQ 



SEQ ID NO 19: AF262402 Spironucleus barkhanus GI:11127703

MKVRQSPFKrormGProRhTOAIQroYQKCIGCQMCAKTCTDSQNFNIFKISAPKTK^
VNAYGSVAEGTERNALAGTDCTGCGACVRACPVEALMPAFNIRPVLEPISEKKKVTI 
AVIAPSTRVGLAEGMGMGVGVTAERQMVYELKQMGFDYVFDNMWGADAPTTEDA 
KEILKAICAAGKTAITSCCPAWVKLVEriYPELLPNISSARSPHGIICSVIKJCYTA^ 
KKADELYWOVMPCTAKKNEAARKELTTOGSPDCDISITTRELMAYLKEKKV^
AREIELKD^^VQAQYDAPFNTFSGSAYIYGKTAGVTEAVVRYVCAIKKVPFSVGMITK 
ELIWENKLHSSSLTLLTFSAAGEDYRICVSYGGLAAHKAVELYKSGELKVDAVEVM 
VCPGGCVGGGGQPKQPKKDMILKRHEGLDKHDKEAPYSNCTENPTLNEFYERIGTD 
VHHVMHTTYSAYK 

SEQ ID NO 20: U19897. Trichomonas vaginalis gi:1171116

MLASSATAMKGFANSLRMKDYSSTGINFDMTKCINCQSCVRACTNIAGQNVLKSLT 
VNGKSWQTVTGKPLAETNCISCGQCTLGCPKFTIFEADAIWVKEVLTKKNGRIAVC
QIAPAIWNMAEALGWAGTISLGKVVTALKRLGFDY^FDTNFAADMTIVEEATELVQ 
RLSDKNAVLPMFTSCCPAWVNYVEKSDPSLIPYLSSCRSPMSMLSSVIKNVFPKKIGT 
TADKIYlsrVAIMPCTRKKDEIQRSQFTMKDGKQETGAVLTSRELAKMIKEAK^ 
PDTPCDNFYSEASGGGAIFCATGGVMEAAVRSAYKFLTKKELAPIDLQDVRGVASG 
\^AEVDIAGTKVKVAVAHGIKNAMTLIKKIKSGEEQFKDVKFVEVMACPGGCVVG
GGSPKAKTKKAVQARLNATYSIDKSSKHRTSQDNPQLLQLYKESFEGKFGGHVAHH 
LLHTHYKNRKVNP 




SEQ ID NO 21: U26964. Trichomonas vaginalis gi:1345093 

MLASSSRAAANIRWVDTSHNAIAFDMHKCmCQAC\^CKNVAGQSVLKSVKI^ 
KKKGWQTVTGKLLAETNCIGCGQCILVCPTQAIHEKDAIXQMNNIFK^ 
CQIAPAIRINMRRPWCSSRNSFHRQSRYSPQRLGFDYVFDTNFGADLTIVEEATELLQ
RLNDPKAVLPMFTSCCPAWVlSrV^KSYPQWMPHl^TCRSPIGMI^AVIia^ 
VDPKRIFSVGIMPCTAKKDEAAREQLMTKSGLHETDLDrrSRELAKJVlIKAAK^ 
PDTELDSPYAMATGGGAIFCATGGVMEAAVRSAYKFATGKELAPIEFVQVRGAEKGI 
KVGTVDINGREIKVAVAQGVKNAMSLIKKIEEGQDDVKGVVFCEVMACPGGCVGG
GGSPRAKTXAAMNKRIXJATYWDRASKYRTPQDOTQLQDLYNAT^^ 

SEQ ID NO 22: AF262401. Trichomonas vaginalis gi:11127700

ASTGINSTANILRNITVTVNGKPLEAKKGETVLELCDRNMRIPRLCFHPNLPPKASCR 
VCL\^CDGKWLSPACVTIWDGLKIDTK5KJ>miDSVENNLKELLDCHDETCSACIA 

NHRCQFRDMNTVAYSVKAETKEICSEEGIDESTNAIRLDTSKCVLCGRCIRACEEVAGT
SADFGNRAKKMRIQPTFGVTLQETSCIKCGQCTLYCPVGAITEKSQVKEALDILANKG 
KKnVVQVAPAWVALSEAFGYKEGTVTTGKMVSALKALGFDLVYDTNYGADLTIC 
EEAGELVNRLW)PNAKFPMFTrCCPAWVNYVEQSAPDFIPNLSSCRSPQGMLSALIK 
NYLPKlLDVKQEDVLNFSIMPCTAKJa>EVERPELRTKSGLKETDNmTVRELVEM 

LSNroF^mLPDTQFDNIFGFGSGAGQIFAATGGVMEAASRTAFE^^n"GKlaT^m^I^
VRGMIXjLRIAELDLIXnTCIXVAVCHGIA]SITAKLU)RLREKDPELMDIia?IEIMA 
GCVCGGGTPQPKNRVSLDNRLAAIYNIDAKMECRKSHENPLIKGVYKEFLGKPNSHL 
AHELLHTHFKHHPKW 




SEQ ID NO 23: U15277 Trichomonas vaginalis gi:557063

MKTIILNGNEVHTDKDITILELARENNVDIPTLCFLKDCGNFGKCGVCMVEVEGKGF
RAACVAKVEDGMVINTESDEVKERIKKRVSMLLDKHEFKCGQCSRRENCEFLKLVI 
KTKAKASKPFLPEDKDALVDNRSKAIVroRSKCVU:GRCVAACKQHTSTCSIQFIKKD 
GQRAVGTVDDVCLDDSTCLLCGQCVIACPVAALKEKSHIEKVQEALNDPKKHVIVA 
MAPSVRTAMGELFKMGYGKDVTGKLYTALRMLGFDKVFDINFGADMTIMEEATEL 

LGRVIOSINGPFPMFTSCCPAWVRLAQNYHPELLDNLSSAKSPQQIFGTASKTYYPSISG
LAPEDVYTVTIMPCNDKKYEADIPFMETNSLRDroASLTTRELAKMIKDAKIK^ 
DGEVDPAMGTYSGAGAIFGATGGVMEAAIRSAKDFAENKELENVDYTEVRGFKGIK 
EAEVEIAGNKLNVAVINGASNFFEFMKSGKMNEKQYHFIEVMACPGGCINGGGQPH 
VNALDRE^m)YRKLRASVLYNQDKNVLSKJlKSHDNPA^KMYDSYFGKPGEGLAH
KLLHVKYTKDKNVSKHE 



SEQ ID NO 24: AF289201 Chlamydomonas reinhardtii gi:9837539

MSALVLKPCAAVSIRGSSCRARQVAPRAPLAASTVRVALATLEAPARRLGNVACAA 
AAPAAEAPLSHVQQALAELAKPKDDPTRKHVCVQVAPAVRVAIAETLGLAPGATTP 
KQLAEGLRRLGFDEVFDTLFGADLTIMEEGSELLHRLTEHLEAHPHSDEPLPMFTSCC 

PGWIAMLEKJSWDLIPYVSSCKSPQMNILAAMVKSYLAEKKGIAPKDMVMVSIMPCT
RKQSEADRDWFCVDADPTLRQLDHVITTVELGl^KERGINLAEU>EGEWDlSfPMGV 
GSGAGVU^GTTGGVMEAALRTAYELFTGTPLPRLSLSEVRGMDGIKETMTMVPAPG 
SKFEELLKHRAAARAEAAAHGTPGPLAWDGGAGFTSEDGRGGITLRVAVANGLGN 
AKiaiTKMQAGEAKYDFVEIMACPAGCVGGGGQPRSTDKArrQKRQAALYNLDEKS 

T^RRSHENPSIRELYDTYLGEPLGHKAHELLHTHYVAGGVEEKDEKK



SEQ ID NO 25: AJ298228 Chlorella fusca GI:18073435

AGPTSECDCPPTPQAKLPHWQQALDELAKPKESRRLMIAQIASAVRVAIAETIGLAPG
DVTIGQLVTGLRMLGFDYVFDTLFGADLTIMEEGTCLLHRLQDHLEQHPNKEEPLPM 
FTSCCPGWAMVEKSNPELIPYLSSCKSPQMMLGAVIKNYYAQQVGVQPSDICNVS 
VMPCVRKQGEADREWFNTTGAGLARDVDHVVTTAEVGKIFLERGIKLNELPESNFD 
NPIGEGTGGALLFGTTGGVNffiAALRTVYEVVTQKPMGRVPFEEVRGLEGIKEAEITL 

KPGDDSPFKAFAGADGQGnLIOAVANGLGNAKKLIKSLSEGKAKYDFIEVMACPGG
CIGGGGQPRSTDKQn.QKRQQAMYNLDERSTIRRSHDNPnQALYDKFLGAPNSHKA 
HDLLHTHYVAGGIPEEK 



SEQ ID NO 26: AY055756. Chlamydomonas reinhardtii GI:18026272

MALGLLAELRAGQAVACARRTNAPAHPAAVVPCLPSRAGKFFNLSQKVPSSQSARG 
STIRVAATATDAVPHWKLALEELDKPKDGGRKVLIAQVAPAVRVAIAESFGLAPGA 
VSPGKLATGLRALGFDQWDTLFAADLTIMEEGTELLHRLKEHLEAHPHSDEPLPMF
TSCCPGWAMMEKSYPELIPFVSSCKSPQMMMGAMVKTYLSEKQGIPAKDrVMVSV 
MPCVRKQGEADREWFCVSEPGVMJVDHVITTAELGNIFKERGINLPELPDSDWDQPL 
GLGSGAGVLFGTTGGVMEAALRTAYEIVTKEPLPRLNLSEVRGLDGIKEASVTLVPA 
PGSKFAELVAERLAHKVEEAAAAEAAAAVEGAVKPPIAYDGGQGFSTDDGKGGLKL
RVAVANGLGNAKKLIGKMVSGEAKYDFVEIMACPAGCVGGGGQPRSTDKQITQKR 
QAALYDLDERNTLRRSHEISIEAVNQLYKEFLGEPLSHRAHELLHTHYVPGGAEADA 



SEQ ID NO 27: AF276706. Scenedesmus obliquus GI:12581498

PHWQQlLDELAKPKERKVMIAQIAPAVRGIAETMGLNPGDVrVGQMVTGLRMLGF 
DYVFDTLFGADLTIMEEGTELLHRLQDHLEQHPNKEEPLPMFrSCCPGWVAMVEKS 
NPELIPYLSSCKSPQMMLGAVnOSIWAAEAGAKPEDICNVSVMPCVRKSCffiAEPRSG 
STHHRA(mRDVDHVMTTAELGKlFVERGIKLNELQESPFDNPVGEGSGGGLLFGTTG
GVMEAALRTVYEWTAEALGPQRSSLTTSTAWTPAQRASPRPSPQAPTAPSRPLQAQ 
TESGITLNIAVANGLGNAKKLIKQLAAGESKYDFTEVMACPGGCIGGGGQPQRNKQI 
LQKRQAAMYDLDERAVIRRTENPLIGALYEKFLGEPNGHKAHELLHTHYVAGGVPD 
RRSEGEAW 




SEQ ID NO 28: NC_003869 Thermoanaerobacter tengcongensis strain MB4T gi:20806542 

MDKWVTIDGITVEWSYYT^EAAKEAGroiPTLCYLKEINQIGACRICLVEIEGVRN
LQTSCTWWIXjMKVYTNTPKIREARRLNLELII^hlHDRNCLTCVRSTNCELQA 
RLGVEEIRFEGENIKYPIDDASPAVVRDPNKCVLCRRCVAVCSEVQNVFAIGMVNRG 
FKTMVAPSFGRSLKDSPCISCCKJCIMVCPVGAIYEKDHTKRVYEALADDKKYVVAQ 
TAPAVRVALGEEFGMPVGTIVTGKMAAALRRMGFDAVFDTNFAADLTIMEEGSELL 

ERIKHGGKLPMITSCSPGWIAFCEKYYPEFIDNLSTCKSPHMMMGALVKSYYAEKKG
LDPKBIFVVSIMPCTAKKLEIEREEMIRNGMKD\a)AVLTTRELARMIKEMGIDFV>IL 
KDEEFDEPLGMSTGAGAIFGATGGVMEAALRTVAEIVEGRDIGKIDFEEVRGLEGVR 
EATITIDGIVIDIKIAIANGTGNAKKLLDKVKAGEVEYHFIEVMGCPGGCIMGGGQPIH 
NPNEMEEVKKLRAKAIYEroK]^PIRKSHENPAIKRLYEEFLGYPLSEKSHELLI^^
SRKELYPLVK 



SEQ ID NO 29: AY033895. Neocallimastix frontalis hydrogenase gi:19547862

MSMLSSAa.NKAVVNPKLTRSLATAAAEKMWISINGRKFQVKPKTTVLEAAKANGY 
YIPTLCYHQELPVAGNCRLCLVYAKGSWKPLTACTTEVWEGMEIETDSPAVIETVRS 

SLSMMREEPff NDCMTCGSNGIX:EFQDLIYRYQroAKHPVR5LLKHKSKKTNHSITEP
CYSPFDNTTFSVARDMNKCVKCGRCIRACHHFQNINILGFINRAGYERVGTPMDRPM 
NFTKCVECGQCSQVCPVGAITARTEWDVLRHLDTKRKVWCSTAPAIRVAPAEEFS 
TEADFDFTGKMVAGLRKLGFDYITOTNFSADLTIMEEGTELIDRLNNGGKFP^ 
CPGWINMVEKSYPELSDNLSSCKSPQQMIGAVIKSYFAKKLGLSTEDIIHVSIMPCTA 

KKGEARRPEFVQKGKDGKDYPOroYVITTRELLTLLKLKKESfPAELPDDKFDSPLGIG
SSAGNLFGVTGGVMEAAmTAQWGVENPIPLGELKAIRGLDGIKAANVPLKTKDG 
KEVSVRAAWSGGANIQKFLEKIKNKELEFDFffiMMMCPGGCINGGGQPKSADPErV 
AKKMQRMYmOlXiAKLRLCHENPEnDWBCNFLGEPNSHIAHELLHrHYlWRSKTI 
HDMGHHEKK 




SEQ ID NO 30: AF446076. Piromyces sp. E2 GI:19548180 

CLVDVKGSWKPLTACTTEVWEGMEIETDTPAVRETVRSSLAMMREEHPNDCMTCES
NGNCEFQDLPyRYQroAQHPVRTLLRNKFKKTNHSmEPCYSPFDDSTFSISRDMNKC 
VKCGRCVRACHHFQNINILGFINl^GYERVGTPMDRPMNFTKCVECGQCSQVCPVG 
AFTERNECIEVLRHLDTKRKIVVVSTAPAIRVALAEEFNAEPDFDFTGKMVAGLKKLG 
FDYIFDT^IFSADLTIMEEGTELITRLNEG<aCFPMF^SCCPGWINM\^KSYPEIRD^^ 

CKSPQQMIGAVIKTYFAKKmAKPEDIIHVSVMPCTAKKGEAKRPEFKRDGVPDIDHV
ITTRELriXLKLKRINPSELKNEKFDSPLGIGSSAGNLFGVTGGVMEAAVRTAQIITGV 
ENPIPLGELKAIRGLDGIKJVASVPLKTKDGKDVNVRAAVVSGGANIQKFLEKLKKKE 
LEFDFVEMMMCPGGCINGGGQPKSADPKVVAKKMERMYTMDDQASLRLSHENPEI 
TQIYKEFLKEPNGHLSHELLHTHYNDRSKAIQDMSLHQK 

SEQ ID NO 31: AF516683. Neocallimastix frontalis GI:23664246

TERNEVIEVLRQLDSKMaLVCSTAPAmVALAEEMADPDFNrrGKMVAGLRKLCT 

DYIFDTNFSADLTIMEEGTELINRLlSINGGKFPMFrSCCPGWINMVEKSWELREN^ 

CKSPQQMIGALIKSYFAKKLGVSTEDIIHVSVMPCTAKKGEAKRPEFVQKGKDGKNY 

PDIDYVLTTRELLTLMKIKKVNPAELADDKLDSPLGISSSAGNLFGVTGGVMEAAVR 

TAQIITGVENPIPLGELKAVRGLEGIKAATVPLKTKEGKDINVRAAWSGGANIQKFL 

EWKNKEWDFVEMMMCPGGCINGGGQPKSADPKIWKXMQRMYTMDEQATIJ^ 

SHENEEVKQIYKEFLIEPNGHLSHELLHTHYNDRSKAIQDMSLHEKK 



SEQ ID NO 32: NZ_AABI01000003. Desulfovibrio desulfuricans G20 GI:23474435

MNGQQNVIRIDSDICTGCGRCKDVCPVGAVEGVQGTPHSIREDVCVLCGQCVQQCS 

AFASFYEQHPACIAEKKRERGLFVSEAAPLFAAWHTGDAPRVAGRLAEGCHSMVQC 

APAVRAAIGEEFGMPAGALTPGRLAAALIUILGFDRWDTNFAADLTIMEEGSELLQR 

MEGAGPLPMFTSCCPAWVRYAEQQFPDLLEHLSSCKSPQQMAGAVFKSYGAQLDG 

VDPRQVFSVAVMPCTCKKAEAQRPGMEHDGVRDVDAVLTTGELAAMLRQAHDFA 

ALPDEPFDRPLGSYSGAGNIFGLTGGVMEAALRTAYELVTGEPVPCTELVYVRGGEG 

IRHATLTMDGRTFRVAWAGLQHVRPLLEAVRAGTCDVNFVEVMCCPQGCISGGGQ 

PKVLLPFQRDEVYAARKAALYRHDAELACRKSHENPQVQALYREFLGEPLSHVSHN

LLHTVYGQTR



SEQ ID NO 33: NZ_AABB01000250. Desulfitobacterium hafniense GI:23112964

NIMQLKHPFQSGFQQQSCKRfflTOCVVVDMESKAGKGSNLSRRSFLKFAGGAGIAGA 
SLSLTGCGQPLTPASAVGGEGWMPTQYNEPGGWPTNVRGRVPIDPENPALRRDDQK 
CILCGQCIEVCKTIQSVYGNYELPLKNEIPCINCGQCIHWCPSGAISEREDIDQVAKAL 
ADPKITWVQTAPATRIGLGEEFGLPVGTNVQGKQVAALRKLGFDVIF 
DTMfAADLTIMEEGTELVKlUTGELHHPLPQFTSCCPGWVKFVEYYYPELLPNLSSAK
SPQQMAGALVKTYFAEKNHVEPQKIFSVAIMPCTAKKFECQRPEMISAQTYWQDEQ 
VSPDVDWLTTRELARMIKRAGIDLPSLPDEEYDQLMGVATGAGAIFGTTGGVMEA 
AVRSAYYLVTGEQPPAALWQLTPVRGMEGVKEAAVSEPGAGEIRIAVISGLDNARAI 
MEQVKAGNSPWTFffiVMACPGGCQYGGGQPRSSAPPSDGVWSrniAASLYKIDAQAK
LRNSHDNPQIKQVYAEFLTSPLSEKJ^ELLHTHYISRAEEFDAKKPQSHEYEV 

SEQ ID NO 34: AJ312124. Eubacterium acidaminophilum GI:14250935

MVNmOGRQVTVPANSTVLDAARDMGINIPTLCYLKDINKTGACRMCLVEVEGIRN 
LQTACTFPVRDGLArV^TNTKR\^ARRDNLQLILSNHHRDCLSCFRNGSCELQALC 
DDMCaLSELDFEAPKELKPVDMLSHSIVRDPNKCILCGRCVAVCNKVQEVGILAFTNR 
GVETEVAPAFATSMADAPOTCGQCVNVCPVAALREKTDffiKVWEVLEDETKHVVV 

QVAPAVRAALGEMFGNPIGTRWGKMFTALKMLGFQKVFDTNFAADLTIMEEGTEL
LGRIKNGGTLPMITSCSPGWIRYVEHFYPELLDHVSSCKSPQQMMGAVLKSYYAEKN 
NIAPENMIWSVMPCIAKKTESAKEEMKNXaiGTRDVDWLTTRELGKMIKEARIEFN 
DLQDSNPDEFFGDYTGAAVIFGATGGVMEAAIRTVADIVSGQELEDIEYTAVRGLEGI 
KEAAVKIGDLEVKVAVAHGTANAGKLMDLVRDGKADYHFIEIMGCSGGCVTGGGQ 

PHVDSRTKEKVNVKLERAKALYTEDKLRDKRKSHHNESWJILYEEYLGKPNGHKA
HELLHTHYKKRELF 



SEQ ID NO 35: NZ_AAAF01000001. Rhodopseudomonas palustris GI:22962132

MCTPDQASLSARDPAEATITLSmGVACAGFANETILSCARRYDVYIPTLCELEDIDHT 
PGACRVCL\nEILQAGKDTPQIVTACNTPVRDGMEVQTRSKKARDMQRLQVELLMA 
DHLQDCATCIRHGSCELQDLAQFVGLQQNRFFDRERTEARPVDHSSPSMVRDMRRC 
VRCQRCVAICRYHQKIDALAIEGSGLERMVALRDADGYPNSVCVSCGQCVLVCPTG
ALGERDETDRALDYICDPNVVTVVQFAPAVRVAFGEEFGLPAGTNVEGQIIAACRKL 
GVDVVLDTNFAADWIMEEGAELLARLKQGRRPTFTSCCPAWINFAEIHYPDVLPLL 
SSTKSPQQVLSTIAKSYLPAQLGVPAERIRVISIMPCIAKKDEAVRPQMVHDGQPETD 
LVLTTREFARLLRREGIDLKDLPSSQFDRPFLSAYSGAGAIFGTTGGVMEAAVRTIYA 
LVNGRELERIELTQLRGFEGLREATVDLGAPVGEVKVAMVHGLGDTRKLVESVLSG
EANYDITEVMACPGGCVDGGGSLRSKKAYLPLALFaUlETIYN\aDRAAKVRQSH^ 
QVQALYRELLQAPNSEIAHRLUmry^ASRKRELQHTVKEIWDDLTMSTILY 

SEQ ID NO 36: NZ_AAAJ01000020. Clostridium thermocellum GI:23021053

MDSFLMKGYIKEANroYSCSRGSMEDLPKWEFREIPKW]^VMPSLSLEEWC>nSlFNE 
VELGLSEEVARKEARRCLKCGCSARFTCDLRKEASNHGIVYEEPIHDRPYIPKVDDHP
HVRDHNKCISCGRCIAACAEIEGPGVLTFYMKNGRQLVGTKSGLPLRDTDCVSCGQ 
CVTACPCAALDYRRERGKVVRAINDPKKTVVGFVAPAVRSLISNTFGVSYEEASPFM 
AGLLKKLGFDKVFDFITAADLTIVEETTEFLSMQNKGVMPQFTSCCPGWINFVEKRY 
PEnPHLSTCKSPQMMMGATVKNHYAKLMGINKEDLFVVSrVPCLAKKYEAARPEFI 
HDGMVDAVLTTTEMLEMMELADIKPSEVVPQEFDEPYKQVSGAGILFGASGGVAE
AALRMAVEKLTGKVLTDHLEFEEIRGFEGVKESTIDVNGTKVRVAWSGLKNAEPIIE 
KILNGVDVGYDLIEVMACPGGCICGAGHPVPEKIDSLEKRQQVLVMDKVSKYRKSQ 
ENPDILRLYNEFYGEPNSPLAHELLHTHYTPKHGDSTCSPERKKGTAAFDVQEFTICM 
CESCMEKGAENLYNDLSSKIRLFKMDPFVQIKRIRLKETHPGKGVYIALNGKQIEEPM 
LSGNIPDESESE



SEQ ID NO 37: AB035092 Clostridium perfringens GI:7959055

MNKHINDKTIEFDGDKTILDLARENGFDIPVLCELKNCGNKGQCGVCLVEQEGNDRL
LRSCAIKAKIXiNWIKTDSEKVLEARKERVAELLDEHEFKCGPCKRRENCEFLKLVIK 
TKARAHKPFVVADKSEYVDDRSKSrVLDRSKCVKCGRCVAACRTRTATNSIKFHRID 
GVRLVGPEELKCFDDTNCLLCGQCIAACPVDALSEKSHIERVQEALNDPEKHVIVAM 
APAVRTSMGELFKMGYGQDVTGKLYTALRELGFDKVFDINFGADMTIMEEATELIE 

RIKNNGPFPMLTSCCPSWVREVENYFPELVENLSSAKSPQQIFGAASKTYYPQVADID
PKKVFTVIYMPCTSKKFEADRPEMEl^GIRMDAVITTRELARMIKAAKIDFAKLEDG 
EVDPAMGEYTGAGVIFGATGGVMEAALRTAKDFMENDNLDNVDYEAVRGLAGIKE 
AEVEIAGNEYKLAVVSGAANVFELVKSGKINDYHFIEVMACPGGCVNGGGQPHISAE 
DSDKMDIREVRASVLYNQDKNLEKRKSHQNSALLKMYESYMGKPGHGRAHELLH
MKYKK 



SEQ ID NO 38: NZ_AAAJ01000005. Clostridium thermocellum GI:23020350

MHVLKLVHSTQYWAEEMDNREYMLIDGIPVEINGEKNLLELIRKAGIKLPTFCYHS 
ELSWGACRMCMVENEWGGLDAACSTPPRAGMSIKTm^RLQKYRKMILELLLAN 

HCRDCTTCNNNGKCKLQDLAMRYMSmWTNTASlWDVDDSSLCITia)RSK^^

CVRVasrEVQNVGATOFAYRGSKMTISTVFDKPIFESNCVGCGQCALACPTGAWVKD 
DTQKVWKEIYDKNTRVSVQIAPAVRVALGKELGLNDGENAIGKIVAALRRMGFDDI 
FDTSTGADLT\^EESAELLRRIREGK]SroMPLFTSCCPAWVNYCEKFYPELLPHVSTCR 
SPMQMFASIIKEEYSTSSKRLVHVAVMPCTAKKFEAARKEFKVNGVPNVDYVLTTQ 

ELVRMIKESGIVFSELEPEAIDMPFGTYTGAGVIFGVSGGVtEAVLRRVVSDKSPTSFR
SLAYTGVRCMNGVKEASVMYGDRKLKVAWSGLKNAGDLIERIKAGEHYDLVEV 
MACPGGCINGGGQPFVQSEEREKRGK<a.YSADKLCNIKSSEENPLMMTLYKGILKGR 
VHELLHVDYASKKEAK 

SEQ ID NO 39: NZ_AABI01000004. Desulfovibrio desulfuricans GI:23474512

MAGCKAQHPPAAYLAGLEVPAAGSEVTMEGVRYKMNAPKDVDPATIRFVEVDHDK 
GMACGECEYHCPTGVMQEVTEDGYRGWDPVACVNCGQCLANCPFGAIHEEVSFV 

GELYEia,KDPDTVVVSMPAPAVRYALGECFGLPTGTYVGGQMHAALRRLGFNLVW
DTEWTADVITMEEGTELLERVKHGNMPLPQFTSCCPGWIKFAETFYPDLEKHLSTCK 
SPIAJVOGPLAKTYGAQEAGVPAKKNTYIVSIMPCIAKKFEGMRPEMNASGYRDIDATI 
TTRELAWMKKAGIDFTSLPSEEPDPALGMSTGAATIFCTSGGVMEAALRLAYEALS 
GGTLADPDIKVVRTHEGINTAEVPVPNFGTVKVAVASGLDNAAKLCEEVRAGKSPY 

HFIEVMTCPGGC\^GGGQPLEPGMLQSSLFKSTITKINRRFTRRSVA



SEQ ID NO 40: NZ_AABI01000049. Desulfovibrio desulfuricans. GI:23475985

MNLVEMEKIQYVDQSPDPRANPDELFHQIDPEKCIGCDTCQEYCPTGAIFGDTGSAH
SIPHEEICINCGQCLTHCPVGAIYEVQSWVRELSEKIKDPEIKVIAMPAPAVRYGLGEC 
FGMPVGTVTTGKMLTAIXJMLGFDHVWDNEFTADVTIWEEGTEFVK^ 
QFTSCCPGWHKYVESFYPELFPHLSSCKSFIGMMGALAKTYGPDVMKYDRSKVYTV
SIMPCTAKKYEGMRADLWSSGYKDroATIDTRELAYMIKKAGIDFAALPDGKRDTL 
MGDSTGGATIFGVSGGVMEAALRYAYEAVTGKKPSSWDFTMVRGLNGIKEGTVTIG 
DAKINVAWHGAKRFAEVCEVIKTGKSPCISSSLCLPR 

SEQ ID NO 41: NZ_AABI01000001. Desulfovibrio desulfuricans GI:23473140

MNLVEMEKIQYVDQSPDPRANPDELFFIQIDPEKCIGCDTCQEYCPTGAIFGDTGSAH 
SIPHEEICINCGQCLTHCPVGAIYEVQSWVRELSEKIKDPEIKVIAMPAPAVRYGLGEC
FGMPVGTVTTGKMLTALQMLGFDHVWDNEFTADVnWEEGTEFVKRLTGQIDKPLP 
QFTSCCPGWUKYVESFYPELFPHLSSCKSPIGMMGALAKTYGPDVMKYDRSKVYTV 
SIMPCTAKKYEGMRADLWSSGYKDIDATIDTRELAYMIKKAGIDFAALPDGKRDTL 
MGDSTGGATIFGVSGGVMEAALRYAYEAVTGKKPSSWDFTMVRGLNGIKEGTVTIG 
DAKINVAWHGAKRFAEVCEVIKTGKSPWHFIEFMACPGGCVCGGGQPVMPGVLEA
MDRKVSRTFAGLKERLNRMSSSKA 



SEQ ID NO 42: AY028641. Trichomonas vaginalis GI:19547861

CDGKWLSPACVTTVWDGLKTOTKSKNVRDSVENNLKELIJDCHDETCSACIA^^ 
FRDMNVAYSVKAETKEICSEEGIDESTNAIRLDTSKCVLCGRCIRACEEVAGTSAIIFG 
NRAKKMIUQPTFGVTLQETSCKCGQCTLYCPVGArrEKSQWEALDILANKGKKJTV 
VQVAPAVRVALSEAFGYKEGTVTTGKMVSALKALGFDLVYDTNYGADLTICEEAGE
L\nm.RDPNAKFPMFTSCCPAWVNYVEQSAPDFIPNLSSCRSPQGMLSALIKNYLPK 
LLDVKQEDVLNFSlMPCTAKia)EVEIU>ELRTKSGPKETDMVLTVRELVEMIKLSNID 
FNNLPDTQFDNIFGFGSGAGQIFAAT 

SEQ ID NO 43: AF446077. Trichomonas gallinae GI:19548182

CDGKWLSPACVTTVWDGLRIDTKSKWRDSVENNLKELLDCHDETCSSCVANHRC 
QFRDMNVAYSVKADTKEICSEEGIDESTHAIRLDTSKCVLCGRCIRACEEVAGTSAIIF
GNRAKHMRIQPTFGGTLQETACIKCGQCTLYCPVGAITEKSQVKEALDILANKGKKV 
TVVQVAPAVRVALSEAFGYKEGTVTTGKMVSALKALGFDLVYDTNYGADLTICEEA 
GELVNRLKDPKAVFPMrrSCCPAWWy^VEQSAPDFIPl^SSCRSPQGlVn.SSLIKNYLP 
KLLGIKQEEVMNFSIMPCTAKKDEIERPELQTKTGLKETDMVLTVRELVENIIKLSNID
FNNLPDTPFDNIFGFGSGAGQIFAAT



SEQ ID NO 44: Y16669 Nyctotherus ovalis GI:4029298

MISRLIAKJCJU'LFLRTFATSEMISLKIDGKIISVPKGIMLADAIKKAGANVPTMCYHPD 
LPTSGGICRVCLVESAKSPGYPIISCRTPVEEGMEIVTQGSKMKEYRQANLALMLSRH 
PNACLSCTSNTNCKTQELSANMNIGQCGFANATPPKNDDSYDMTTAIERDNDKCINC 
DICYHTCSLQGLNALGFYNEEGHAVKSMGTLDVSECIQCGQCINRCPTGAITEKSEIR 
PVLDAINIQQRLVFQMAPSIRVAVAEEFGIKPGEKILKNEIATALRKLGSNVFVLDTNF
SADLTIffiEGHELffiRLYRNVTGKKLLGGDHMPIDLPMLTSCCPGWIMFIEKNYPDLL 
NNI^TCKSPQGMLGALIKGYWAKNIKKMDPKDrVSVSIMPCTAKKAEKERPQLRGD 
EGYKDVDYILTTRELAKMLKQSNIDLAKMEPTPFDKVMSEGTGAAVIFGVT 

SEQ ID NO 45: AY028640 Trichomonas vaginalis, gi:19547858

CDGKWLAPACVTTVWDGLKTOTKSKMVKESVENNLKELLDCHDETCSSCVANHRC 
QFRDMNVAYSIKAETKEECSEEGIDESTNSIRLDTSKCVLCGRCIRACEEVAGQSAIIF 
GNRAKHMRIQPTFCK^TLQDTSCnCCGQCTLYCPVGAITEKSQVKQALDILSNKGKKlS
VIQVAPAVRVALSEAFGYKEGSVTTGKMVSALKALGFDYVYDTNYS 
ADLTIVEEAGELVQRLKNPNAVFPMFTSCCPAWVNYVEQSAPDFIPNLSSCRSPQGM 
LSSLVKNYLPKVLNIPVEDVLNFSIMPCTAKKDEIERPELRTKDGHKETDMVLTVREL 
VEMIKLSGIDFNNLPDTPFDSIFGFGSGAGQIFAAT 

SEQ ID NO 46: AF262400. Entamoeba histolytica GI:11127699

RLmVTGHDHNHSIQFDWSKCMGCGMCATKCTFGVLVKQPPKIPPFVQPNREKLSQ
ENTDKTRVLIDESECTGCGQCSLVCNFGSITPIDHLVDTFKAKEAGKKLVAMIAPSTR 
LGVAEAMGMPIGSTAMAQLVHCLRLIGFDYVFDVDAGADKTTMDDYAEVIEMKKE 
GKGPATTSCCPAWIELVEKEYPDLIPNVSTARSPIGCLAGCIKRGWAKDVGIAVEDLY 
TVGIMPCIAKKTESQRQQIHQDYDASCTSNEIAAYFKKHLPPEECKFTQEREEALAKT 
EDGQCDLPFRRISGGSNIFGRTGGVCETVLRVIARNAGVDWNSCTVNKEETFKHAAS
GSTNfimSVDIGGTIITGAVCHGGYAIRHACELIRKGELKVDVVEMMACVGGCLGG 
AGQPKIPPAKiaEMDKRRVMLDILDQQTDIRAANElsrro\a.GWroKHroH(KjAHQm 
HTYFTPRYQN 

SEQ ID NO 47: AF242293. Giardia intestinalis GI:13506793

MPPKPQHDVTGVDSNNAIMIDYAKCIGCNMCIKACDVQGIGVYKQNEKPKYPPIVKL 
STLFNSDCIGCGQCATICPVDAIAPKNNLEIYKGESASKKVRVALIAPSTRVAFGDVFG
LPIGTNTIYSLIRMLKQYLGFDYVFDVNFGADETTVIDTQELLHFKHEGRGPVFTSCC 
PAWVNLCEMKYPELLPQVSTAKSCVAMVATLVKWRWVQEHLIPKGIVDSVDDVYV 
ADIMPCTAKKDESMRPQLNRDVDICLTVREVAEHLYFLHGARLTLEEVEADALVLRP 
GRSTQKKWDFDAPFNTVSGGSHIFGKTGGVAETCLRnSYMKKSPIENVKEELLKEFK 
TPGQLVQTVKLVSCEIAGETYRALIAHGGSAINAAARMVLNKEVECDVVEQMACPG
GCQNGGGMPKIKGKKEAVLTRASTLDILDGKERFASAGENKTLWGFNGCLTEHEAH 
ELLHTHYQHRPVESLLPQ 



SEQ ID NO 48: NZ_AABB01000211. Desulfitobacterium hafniense GI:23112567

MVKIISITNNAKRQGKGTSRKEKQAMKEWKQQRIRVTVNGRQMEVYGDLTILQAL 
LQEDMPHLCYDIRLERSNGNCGLCWELGEGSEQQDYKACHTPIQEGMIIHTNSPRL 
EHYRKIIU.EQILADHNADCVAPCV^^'CPA^ra)IQSYLSHAGNGNFETAIKVIKERN^
PIVCGRVCPHSCEAQCRRNLroEPVAINHVKRITADWDlAHEQPWAPRKKAATGKKI 
AWGAGSSGLSAAYYSAIQGHDVTVFERHPRAGGMMRYGIPEYRLPKETLDREIGLI 
ADLGVlaMT^a^ALGTH^U.EDLHQDFDAVYLAIGSWRATPIXJmGDNLEGVWLGI^ 
LEQVTKGADKLGEHVVVIGGGNTAroCARTALRKGAGSVKLAryRRTREEMPAES^
EVEEAmEGVEMYFLTAPHKWAEGGRKLLHCIKMTLGEPDRSGRRRPIPIEGSETAFE 
ADTnGAIGQSTNTQFLYHDLPVia,NKWGDIEINGKTMQTSEMNIFAGGDCVTGPAT 
VIQAVAAGRHAAEAMDSFLMKGYVKEQPMDYSCSRGSLEDLPQWEFEKIPRLKRAP 
MPALPPAERRDNFREVETGLSEETARAEARRCLKCGCYERYDCDLRQEASLHHVEF 
KKPVHERPYIPIVEDHSIIIRDHNKCISCGRCIAACAEVEGPDILSFYMKHGRQLVGTK
SGLPLDQTDCVSCGQCVNACPCGALDYRSEIGRVFRAINDPGKTTVAFVAPAVRSW 
SSQYGVSYQEASRFIAGLLKKIGFDKWDFITAADLTIVEETTEFLTRLQSHKPIPQFTS 
CCPGWVNFVERRYPEIIPYLSSCKSPQMMMGATVKTTLRN 

SEQ ID NO 49: AJ420003 Nyctotherus velox GI:19572306

ILFMEKNYPDMLNHLSTCK5PQGMLGALIKGYWAKNVKKIDPKDVVSVSIMPCTAK 
KEEKDRirLKSDEGYNNVDYVLTTRELAKMFKQSNIDPSKLPPTQFDNVMSEGTGAA
VIFGVT 



SEQ ID NO 50: AC091811. Oryza sativa GI:18087682

MASSSSSASSRFSPALQASDLNDFIAPSQDCnSLNKGPSARRLPIKQKEIAVSTNPPEE 
AVKJSLKDCLACSGCITSAETVMLEKQSLGDFITRINSDKAVIVSVSPQSRASLAAFFG 
LSQSQVFRKLtALFKSMGVKAVYDTSSSRDLSLDEACSEFVTRYHQNQLSSGKEAGK 
NLPMI^SACPGWICYAEKTLGSFILPYISAVKSPQQAIGAAIKHHMVGKLGLKPHDV 
YHXO'VMPCYDKKLEAVRDDFVFSVEDKDVTEVDSVLTTGEVLDLIQSRSVDFKTLE
ESPMDRLLTNYDDDG<)LYGVSGGSGGYAETVFRHAAHVLFDRKIEGSVDFRILKNS 
DFREVTLEVEGKPVLKFALCYGFRNLQNIIRKIKMGKCEYHFIEVMACPSGCLNGGG 
QIKPAKGQSAKDLIQLLEDVYIQDVSVSNPFENPIAKRLYDEWLGQPGSENAKKYLH 
TKYHPWKSVASQLQNW 

SEQ ID NO 51: AJ420004. Psalteriomonas lanterna GI:19572308

INLVEKHYPEYLPNLSSCRSPQGMLSSLIKNYWAKKMGffiPKDVVVVSFMPCGAKK
DEIKRPQLKGETDYVLTTRELGKLFKMGGLNDLSVLEPVKYDDPLGESTGAAVIFGA 
T 

SEQ ID NO 52: AJ420001. Nyctotherus ovalis GI:19572302

IMFMEK^fYTDMLNHLSTCKSPQGMIXJALIKGYWAKMKmDPKDIVSVS^ 

KAEKERPQLRGDEGYTa)VDYILTTRELAK]Va.KQShra)LGKMEPTPFDKVMSEGTGA 

AVIFGVT 

SEQ ID NO 53: AJ420002. Nyctotherus ovalis GI:19572304

IMFMEK>rVTDMLNHLSTCK5PQGMLGALIKGYWAKNVKKMDPKDrVSVSIMPCTA 
KKAEKERPQLRGDEGYKDVDYILTTRELAKMLKQSNIDLGKMEPRPFDKVMSEGTG
AAVIFGVT 



SEQ ID NO 54: NZ_AAAG01000003 Rhodospirillum rubrum GI:22966612

MRPVQRPRRWPGLRQRLSPERPVDRRSRRRSGAARPGRRRGSGVQHEILRSVSQRD 
MSMSIQPTVnDPELCTGCGRCVETCPVQAIAGSRGKAHEIEAAACVSCGRCVATCA 
AFDSIFDAFPTPRPVRLKRRGLPGSLKEPLFAAHDPSRIEAVRKAFATPKRMTVMQVD 
TMACVALAEDFGLPPGSLSPLKIASAARQLGFDRVYRTSFPAGLAVLETAHEMAARL
ANGGNLPVINSSCPAVVAFLERRYPELLHYLSTVKSPHQIAGALYNSYLADAANLAP 
AMHKVSVVACLSHKAEAERPEMNfTCGCPDIDTVLTARELAILIKDAGIDVPLLGDGE 
FDNDFPEIEGLDTLYCAPGDVSRAVLGAGRWFLGQGEGVGAPAGETVEVLDEATRL 
TRLAYPGGTLQALTVAGFDKAVPYLEAIKAGRNAFQFLEIASCPQGCASGAGLPKVL 
LETEOARYRARIENLPPAAPEAWSRIJ>GHPSIVALYGGYTGKAIGDKSNRItLHTQY
AEPAAAP 



SEQ ID NO 55: NZ_AABB01000211. Desulfitobacterium hafniense GI:23112569

MAVEKLTGEVLTDQLDYQEVRGLQGIKEAAVEAKGKKVNVAVISGLHNVEPILEKII 
EGMEVGYDLIEVMACPGGCICGAGHPVPEKIDTLEKRQQVLVNroQTSRYRKSQENP 
DILRLYDEYYGEANSPLAHKlLHTHYEAVKJlEPVAKJiDRRMADSARrmELTLCTC
DKCTAQGSRELFAALSGmKLKMDSFVTARTIRLKENHPGQGVYAAIDGKLIETPVE 
QLEQRIFQHLIR 

SEQ ID NO 56: NZ_AABB01000211 Desulfitobacterium hafniense GI:23112568

NfVSrVn>CIAKKYEAARPEFRSEGIRDVDAVLTSTEMLEMADIKLIEPADVEPQDFCEP 
YKRVSGAGILFGASGGVAKRPCGWRWRN 

SEQ ID NO 57: AY071338. Drosophila melanogaster GI:17945827

MSRLSRALQLTDIDDFITPSQICIKPVQIDKARSKTGAKIKIKGDGCFEESESGNLKLNK 
VDISLQDCLACSGCITSAEEN^ITQQSREELLKVLQENSKNKASEDWDNWTTVFTLA 
TQPBLSLAYRYQIGVEDAARHLNGYFRSLGADYVLSTKVADDIALLECRQEFVDRYR
ENENLTMLSSSCPGWVCYAEKlllGNFLLPYVSTTRSPQQMGVLVKQILADKMNfVP 

ASRIYH\nrVMPCyT)KKLEASREDFFSKANNSRDVDCVITS\^VEQLLSEA(K?PL^ 
DLLDIJ>WPWSNVRPEFMWAHEKTLSGGYAEHIFKYAAKHIFNEDLKTELEFKQLK 
NRDFREIILKQNGKTVLKFAIANGFRNIQNLVQKLKREKVSNYHFVEVMACPSGCIN 
GGAQIRFTTGQHVRELTRKLEELYQNLPRSEPENSLTKHIYNDFLDGFQSDKSYDVLH
TRYHDVVSELSISLNINW 

SEQ ID NO 58: AL049559 S. pombe GI:4581520

malo^vndlndflspgavcikpaqvkkqeskl^irrogdayyevtkd^^ 
sislnix:lacsgcitsaetvlvnlqsyqevlkhlesrksqeilyvslspqvranlaay 

yglslqeiqavlemvfigklgfhaildtnasreivlqqcaqefcnswlqsrahknqn
qvtnswnehpliphstsqisgvhsntssnsginenavlpilssscpgwicyvekthsnl 
ipnlsrvrspqqacgrilkdwavqqfsmqrndvwhlslmpcfdkkleasrdefsen 
gvw5vdsvltpkelvemfkflridpieltknpipfqqstdaipfwyprrryeeqigsssg 
gymgyvlsyaakmlfgiddvgpyvsmnnkngdltwflrhpetneqlismatcyg 

frniqnlvrrvhgnssvrkgrvllkkrvrsnaqnpteepsrydyvevmacpggcin
gggqlpfpsvemsardwmqqveklyyepgtrsvdqsavsymleqwvkdptltpk 
flhtsyravqtdndnplllankw 



SEQ ID NO 59: AJ419999. Metopus contortus GI:19572298



IIFAEKNYPElVrVWlLSTTKSPMQMLSSLSKGYWAKEGKiaDPKNYVNVAINff 
KAWKERPDMKADNGDPVTDYVLTTRELGTMLRQSNINPVSLPKTPFDKIMGESTGA 
AVIFGAT



SEQ ID NO 60: AK076232 Mus musculus GI:26345230

MKCEHCTRKECSKKSKNDDQENVSSDGAQPSDGASPAKESEEKGEFHKLADAKIFLS 
DCLACDSCVTVEEGVQLSQQSAKDFLHVLNLNKRCDTSKHRVLVVSVCPQSLPYFA 
AKFNLSVTDASRRLCGFLKSLGVHYVFDTTIAADFSILESQKEFVRRYHQHSEEQREL 
PMLTSACPGWVRYAERVLGRPnPYLCTAKSPQQVMGSLVKDYFARQQNLSPEKIFH 
VVVAPCYDKKLEALREGLSTTLNGARGTDCVLTSGEIAQIMEQSDLSVKDIAVDTLF
GDMKEVAVQRHDGVSSDGHLAHVFRHAAKELFGEHVEErrYRALRNKDFHEVTLE 
KNGEVLLRFAAAYGFRNIQNMIQKLKKGKLPYHFVEVLACPRGCLNGRGQAQTEDG 
HTDRALLQQMEGIYSGIPVRPPESSTHVQELYQEWLEGTESPKVQEVLHTSYQSLEPC 
TDGLDIKW 

SEQ ID NO 61: NM_065691 Caenorhabditis elegans GI:25148541

MEDSGFSGWRLSNVSDFIAPNLDCnPLETRTVEKKKEESQVNIRTKKPKDKESSKTE 

EKKSVKISLADCLACSGCITSAETVLVEEQSFGRVYEGIQNSKLSVVTVSPQAITSIAV 

KIGKSTNEVAKIIASFFRRLGVKYVIDSSFARKFAHSLIYEELSTTPSTSRPLLSSACPGF 

VCYAEKSHGELLIPKISKIRSPQAISGAIIKGFLAKREGLSPCDVFHAAVMPCFDKKLE 

ASREQFKVDGTDVRETDCVISTAELLEEIIKLENDEAGDVENRSEEEQWLSALSKGSV 

IGDIXKjASGGYADRTVRDFVLENGGGIVKTSKLNKNMFSTTVESEAGEILLRVAKW 

GFRNVQNLVRKMKTKKEKTDYVEIMACPGGCANGGGQIRYETMDEREEKLIKVEA 

LYEDLFRQDDEETWIKVREEWEKLDKNYRNLLFTDYRPVETNVAQVLKW 



SEQ ID NO 62: AK081305 Mus musculus GI:26349103

MKCEHCTRKECSKKSKTDDQENVSSDGAQPSDGASPAKESEEKGEFHKLADAKIFLS 

DCLACDSCVTVEEGVQLSQQSAKDFLHVLNLNKRCDTSKHRVLVVSVCPQSLPYFA 

AKFNLSVTDASRRLCGFLKSLGVHYVFDTTIAADFSILESQKEFVRRYHQHSEEQREL 

PMLTSACPGWVRYAERVLGRPIIPYLCTAKSPQQVMGSLVKDYFARQQNLSPEKJFH 

WVAPCYDKJa.EALREGLSTTLNGARGTDCVLTSGEIAQIMEQSDLSVKDIAVDTLF 

GDMKEYAVQRHIXjVSSDGHLAHVFRHAAKELFGEHVEEITYRALRNKDFHEVTLE 

KNGEVLLRFAAAYGFRNIQNMIQKLKKGKLPYHFVEVLACPRGCLNGRGQAQTEDG 

HTDRALLQQMEGIYSGIPVRPPESSTHVQELYQEWLEGTESPKVQEVLHTSYQSLEPC 

TDGLDIKW 



SEQ ID NO 63: AJ419998. Neocallimastix sp GI:21425619

imfaeknfpdmvnnlsttkspmqmlssltkgywakdncianpkdvv>n^ain^ 

kqekdrpgmktdegdkvtdfvlttrelgmmlrqanidptklpgtkfdkvmgestg 
aavifgat 

SEQ ID NO 64: AJ420000. Nyctotherus ovalis GI:19572300

IIFMEKNYPDMLSHLSTCKJSPQGMLGALIKGYWAKKVKKVDPKDWSVSIMPCTAK
KAEKERPQLRGDEGFKDVDYVLTTRELAKMLKQSNIDLGKVEPTPFDAVMSEGTGA 
AVIFGVT 



SEQ ID NO 65: NC_003366. Clostridium perfringens GI:18311557

MAIKDANKQYKFDTAVQVLKYEVLKMAEKEFDGTLDKEKLNIAKEIVDDLKPNVR 

CCIYKERAIVEERMKLALGGHENRENMIEVIDIACDECPWIRFIVTDACRGC^ 

DSCNFGAISFDNRKCKTOYEKCKECGKCKEVCPYNAIAEVKRFCMRACIPKALSYDV 

DSKKAVIDDSKCIQCGACWDCPFGAIMDKSYLVDVIRLLKDEKKV

YAIVAPAISSQFNHSKIGKVITAIKKLGFEDVFEAALGADLVAVHECNEFKEKGELDF 
MTTSCCPAFVSYEKNWELKEClSKrVSPMVAMARLlKSQNKDVKTVFIGPClAKKT 
EAKRNEVSGDVDYVLTFEELLALLDSRNIKIDECEESDTKHGSFYGRLFARSGGVTES 
VKmiDSEGIKVDFRPILGDGIKDCDIKLRLAKLKRAQGNFLEGMACKGGCINGPGSL 

NHDIKNSKEVDKYGELSSSEKIKDTLADIKFEDLNLSKNE



SEQ ID NO 66: NM_012336. Homo sapiens GI:6912524

MKCEHCTRKECSKKTKTDDQENVSADAPSPAQENGEKGEFHKLADAKIFLSDCLAC 
DSCNO'AEEGVQLSQQNAKDFFRVLNLNKKCDTSKHKVLVVSVCPQSLPYFAAKFNL 
SVTDASRRLCGFLKSLGVHYWDTITAADFSILESQKEFVRRYRQHSEEERTLPMLTS 
ACPGWVRYAERVLGRPITAHLCTAKSPQQVMGSLVKDYFARQQNLSPEKIFHVIVAP 
CYDKKLEALQESLPPALHGSRGADCVLTSGEIAQIMEQGDLSVRDAAVDTLFGDLKE
DK\rniHDGASSDGHLAHIFRHAAK£LlTsrEDVEE\aTRALRN^FQEVTLEKNGEVV 
LRFAAAYGFRNIQNMILKLKKGKFPFHFVEVLACAGGCLNGRGQAQTPDGHADKAL 
LRQMEGri^ADIPVRRPESSAHVQELYQEWLEGINSPKAREVLHTTYQSQERGTHSLDI 
KW 

SEQ ID NO 67: BC016440. Homo sapiens GI:16741189

MKCEHCTRKECSKKTKTDDQENfVSADAPSPAQENGEKCDTSKHKVLWSVCPQSLP 
WAAKFNLSVTDASRRLCGFLKSLGVHYVFDTTIAADFSILESQKEFVRRYRQHSEEE 
RTLPMLTSACPGWVRYAERVLGRPITAHLCTAKSPQQVMGSLVKDYFARQQNLSPE 
KIFHVIVAPCYDKKLEALQESLPPALHGSRGADCYLTSGEIAQIMEQGDLSVRDAAV 
DTLFGDLKEDKWRHDGASSIXjHLAHIFRHAAKELFNEDVEEVTYRALRNKDFQEV

tlekngevv^rfaaaygfmsriqnmilklkkgkfpfhfvevlacaggclngrgqaqt 
pdghadkallrqmegiyadipvrrpessahvqelyqewleginspkarevlhttyqs 
qergthsldikw 

SEQ ID NO 68: NM_031968. Homo sapiens GI:14165461

MKCEHCTRK£CSKKTKTDDQENVSADAPSPAQENGEKGEFHKLADAKIFLSDCLAC 
DSCMTAEEGVQLSQQNAKDFFRVLNLNKKCDTSKHKVLWSVCPQSLPYFAAKFNL
SVTDASRRLCGFLKSLGVHYVFDTTIAADFSILESQKEFVRRYRQHSEEERTLPMLTS 

acpgwvryaervlgrpitahlctakspqqvmgslvkdyfarqqnlspekifhvrvap 
cydkklealqeslppalhgsrgadcvltseisqawwctpvitatreaaareslepgr 
qrlqrdkiapldsslggg<^iaqimeqgdlsvrdaavdtlfgdlkedkvtrhdgass 

IXjHLAHIFRHAAKELFNEDVEEVITRAUWKDFQEVTLEKNGEWLRFA>^

IQNMILKLKKGKFPFHFVEVLACAGGCLNGRGQAQTPDGHADKALLRQMEGIYADI 
PVRRPESSAHVQELYQEWLEGINSPKAREVLmTYQSQERGTHSLDIKW 

SEQ ID NO 69: AE015936. Clostridium tetani GI:28202388

MHNDYREIFKRLSKSYYDDTFEKEVENILSSHSMDREKLAKIISILCGVNIEHSENYIS 
NLKNAIKNYTASAEKVVTKLPCSTQCAKDGDIICEKSCPVNAIFRDPNDNNIYINDEL 
CLDCGLCVRNCPSGSILDKKEFIPLAELLKSESrVIAAVAPAIMGQFGENTTINQLRTAF 
KKLGFTDMVEVAFFADMLTLKEAVEYDHFVKDEQDFMITSCCCPMWVCMLKKVY
TSIDLVKYVSPSVSPMIAAGRVLKLLNPNCKVVFVGPCIAlOCAEAREKDLLCHJroFVLT 
FTELRDIFDWDIQPENLEEDFSSEYASKGGIU.YARTGGVSIAVSEAffiKLFPlS[KYKFL 
KTIQADGVKGCKSLLDKIKQEDISANFVEGMGCVGGCVGGPKVIIDPSEGRNAVNNF 
AENSSIKVSVDSNCMNDILSKININSVEDFKDKDKISIFEREFK



SEQ ID NO 70: NZ_AABI01000028. Desulfovibrio desulfuricans GI:23475858

MWRTYDNTINFEIMVRIAKAFHGDSFEEQVARffLEMRPRKAHSSRCCIYR^ 
RCMAMLGYAIEDETDELTSLSQYAKGALERDSIQGSMLTFIDEACNGCVRTHYEATS 
ACRGCLAEACVQHCPKDAVRTVDGKSRIDPDKCVQCGKCMNVCPYHAIVQIPIPCEE 
SCPTGAISKDECGKQVTOYDRCIFCGKCMAACPFAAVLEKSQMIDVLRRIREGRKW 
ATVAPAIAGQVQAPMSRLATALRQLGFADVAEVASGADTTARLEADEFVERMEHGA
AFMTSSCCPAYTQLVDKHLPEIAPFVSDTRTPlSfflYTAAMVKDHDPDMVTVFIGPCV 
AKRNEGKHDELVDHVLTFQEWAMLTAAGISVDACEDGRFMFPAMREGRSFPVSG 
GVTAGVQAHIGTRAEVRPLSVDGLNKKTFRQLKTWAKKGCEGNFYEVMGCQGGCV 
AGPATVM 


SEQ ID NO 71: AE015944. Clostridium tetani GI:28204446 

MIVFENQLKKLKYLVLKEVAKMTLEDRLGEEDIERISFDIIKGDKAEYRCCVYKERAI 
VYERAKLATGCLPNGQVAEEFVHVEDDDQnyvroAACDKCPINKYVVTEACRGCLQ
HKCMEVCPAGSINRAAGKAYINHETCKECGLCESACPYNAIAEVMRPCRRACPTGA 
LQMNI^DNKATINKEDCINCGSCMSVCPFGAISDKSYWDrnCALKNNKKVYAMVAP 
ATTGQFGKDVSVGKMKNAFKAMGFEDMLEVACGADAVAAHESEEFIERLESGKKY 
MTTSCCPGFLGYIEKKFPDQLENVSNTVSPMVAIGRMIKXEYEDSVVVFVGPCTAKK 
AEKRKGIKDAVDYVlVm^EElAALMGAFEIDPAECEEEDINDGSNYGRGFAQGGGVV
SAIQNCIKDKEGIKFNPLRVSGPDQIKRAMIMAKVGKLSENFIEGMMCEGGCIGGPAT 
MVSAVKAKAPLMKFSKSSTIKDVKDNEVLDKYKDINMER 

SEQ ID NO 72: AK117970. Arabidopsis thaliana GI:26451001

MDLIKLKGVDFKDLEESPLDRVLTNVreEGDLYGVAGSSGGYAETIFRHAAKALFGQ 
TIEGPLEFKTLRNSDFREVTLQLEGKTVLKFALCYGFQNLQNIVRRVKTRKCDYQYY
EIMACPAGCLNGGGQIOKTGQSQKELIHSLEATYMNDTTLNTDPYQNPTAKRLFEE 
WLKEPGSNEAKKYLHTQYHPWKSVTSQLNNW 



SEQ ID NO 73: NC_003366. Clostridium perfringens GI:18309258



MNKKYNSLFKELISSYYSEDNFDEKLNDIVKNNFNSKEDAIEVLSSLCGVDIDKNSDN 
lAYDIWCATITHKIKKNIVDKVSVCTKNCSKESKGKCQSLCPFDAILTDPIDNSKYIDP 

NLGQNCGICVQVCESGHFUJRIELLPiroLIKNl^TVaAAVAPAIAGQFG

EAFIKIGFSDMffiVAFAADMLSIKEAyEFNHHVEKTGDILITSCCCPMWVAMLRKCY 
KDLVKDVSPSVSPMIAAGRVIKKLNKDAKWFIGPCIAKKAEAREKDLVGAIDYVLT 
FEELNGIFEALKIDPSSMKGVPSIEYTSRGGRLYARTGGVSEAINDVVKELYPDKAKIF 
KAVQANGVKECKELLNKVQSGELKANFIEGMGCVGGCVGGPKJUVDPSIGKKHVDE 

VAYNSPIKVATHSHTMDEVLLRLGINSLKSFEDKEKISIFEREF



SEQ ID NO 74: NZ_AABB01002911. Desulfitobacterium hafniense GI:23120640

MAQSEIMKIRRQVLKSALDWVSHDQNRKDRATLARQIIPDGTPRYRCCIHKERAVIE 
ERLKA\a.EPDEGPIVRVLKEGCNGCEMHRYSVTDHCQNCVGHFCFTNCPKKAILFIN 
NKAFroQTRCVECGLCARNCPYHAIIEYRRPCEDSCPTKAISVREDRIASIAEAHCTSC 
GKCIISCPFGAVAESSQLIULFEAVRNPEHKIYAVIAPAFVGQFGRKVSPGQVKSALLJC 
LGFQDVLEAALGADRTIELEAREYDERLAHGEEFMTSSCCPAYVSAVIKEKPDLFHHI
SSTLSPMAQVAHILKEKDPEAKLAFIGPCVAKJKEEGKRPETKVDFVLTFEELMVAVLD 
YAGINPAEESEQ 

SEQ ID NO 75: NZ_AAAS01000012 Geobacter metallireducens gi:23055369

MCHWLHREAGLVYDPAVDQAINRVSGLTLSAGRTMEPIITVKEKCRKCYCCVRSCP 
VKAIKVAKSYTEIIVDRCIGCGNCLSNCPQQAKIVIVADKVGVTEKLLSSGEEVIAVLG 
SSFPAFFHNVTPGQLVAGLRKIGFAEVHEGSYGAELIADDYARITSEKGHPRISSHCPA
rVDLffiRHYPKLVGNLVPWSPMVAMGRYLKGTLGQHVRVVYISSCVANKLETQTQ 
ETRGAVDIVLTYRELEGIFRSRQIALPALADEPLDGIRPGAGRLFPIA 
DGTFRAFGIPFDPLDTEIVAACGEVNVMGIINDLAAGRISPRIADLRFCYDGCIGGPGR 
NRALTEFYRRNRVIAUFKQEVPCRTVPNSLLEAGRVSFGRSFASKYAKLEAPKANDV 

RKILNATNKFTVKDELNCRACGYRTCREYAVAVFQGLAEIEMCLPYNLQQLEEDRG
RLIQKYELARRELEREYGDEFIVGNDRKTLDVLGLIKQVGPTPTrVLIRGESGTGKEL 
TARAIHRYSKRNDKPLVrVNCTTITDSLI^SELFGHKRGAFTGAVADKKGLFEAADG 
GTIFLDEIGDITPKLQAELLRVLDMGE\^VGGTAAKKVDVRLIAATNKM.EQGV^ 
GWFREDLYYRLNVFITmPPLRSRVESIPILVHHFMDKASTKLNKRMVGffiDRA\^ 

LmVPWPGNIREMQNVIERAAVLTHDGVIRVENFPLALSEGLEEGFATGLDIHAASFR
SEREQHMGKLEKKLIQRYLTEANGMSRAAKLANIPRRTFYRLLDKYRLRERDVR 



SEQ ID NO 76: AE007819 Clostridium acetobutylicum GI:15026303

MNNKYIELFKSLVDSYYNDTFDSFVYHILSDEEVDKKELSKVISSLCGVSVEFKDTET 
YISELKKAIS^^¥^CCTDN^^^KIKECDSSGHSNEGETPCQK5CPFDAILVDK^rIX 
KDLCTIX:GNCITSCPSGSILDKffiFMPLLNLFK]S[NETVIAAVAPALAGQFGENVSL^

RTAFKKVGFADMVEVAFFADMLTIKEAFEFNELVNSKDDLMITSCCCPMWVSMIRKI 

YKDLARHVSPSVSPMIASGRVIKKLNPNCKVVFIGPCIAKKAESRSQDISDATOFVLTF 

EELKGIFDVLDIDPEKLPETHTKSYASREGRLYGRTGGVSTS 

VDEAVKRIFPNKHHLFKSTK\a)GVKDCKDIL>fKTQAGMGANFLEGMGCVGGCVGG 
PKAI\T^KDQGRESWnCTAESSEIKISVDSERMKDILSRIGINSffiDFGDKSKVDIFERRF



SEQ ID NO 77: NC_004347 Shewanella oneidensis GI:24375409

MNKKKHLFAEDSFFLSRRia?MAVGAAFVAALAIPIGWFrSKLERRNEYIKARS
KDDSLAKTRVSHANPAVEKYYKEFGGEPLGHMSHELUfTHFVDRTKLSS 

SEQ ID NO 78: AY172%3. Entamoeba histolytica GI:27652439

MSTQLTPLRNKIISEWKCFKSGRFIEDIDKLPTILTDGDGWKPTSKFVHSREQEEGIY 
REKVLSVLGFVDGEYDDITPLHVYAQKALERTSLHEPVFGISQKGCNKCHFNGYFVT 

QACEGCTSRPCSVNCPKKCISFGEDGRAVINQNNCIKCGRCYKFCPYGAnSKSVPCV
KACPCGAMUJSPEGVKTroPEKCINCGGCMRACPFGAILPRSNLroVLKILPTKKVVA 
CPAPSIAAHFGKYDLALVSGGLIQVGFTSVEDVSYGADLCALNEAKEFEERIVKNKK 
DFMTOCCPAYINAINKHMPELKE^WSHITTPMHFATQAVKDRDQETVTVFIGPCNA 
KRWETLQDSTTDYCLTFDEIFGLFEGSGIDLSKVQPYTFVDKAHKEGKIFAVSGGVAS 

AVASLLPKEVPDGVIKPTiroGFSQENFKRLKNFiaaSlTrGNLVEYMVCECXJCAYGPG
CPGLNTPATSAKIKIAVDKMEAHPEGRWYGLPNSQKPIKVEN 



SEQ ID NO 79: AY145118. Cryptosporidium parvum GI:23213133

MFSTAVKLANLDDYLESSQDCWSLLSDKDDTKPKIAVMRPAKAQGNKDDKKSGTS 
DKATVNVADCLACSGCVTSAEAKLLEDQNVSEFNINILKQKRLTVVSISNQSCSSFAC 
HLNCDLrriQRKLSGLFKHIGARFVMNSTISEYISLLETKYEFISRYKAKSDLPMnSHCP 
GWIC¥^SEKSlJNrSSVU>LLSK\^A(K^LQGILIKTXTLEryNQLLFLYKFRLSNSYRTl^
NVKSTFTQNDNFVEQSDIFHVAIMPCHDKKLESTRSSLSLKSS 

DKNSSCPEVDIVLATSEVGEIIKLAGFNSLLDVPEAPLDNLWLNQNPQITKKHNLSLLI 
TENYVSNQILNQFSWLIPSYFNSNSGGFCEYnRSAKELAGDHIDNKVQLPFNKLKND 
n.EAKYIK]SINVELNYCLAYGFRAIQSISRKLNLQKNASQNTQYKQSVVNHVNYHLIE 
AMACPTGCVSGGGQILSQNDQNDDNSDLNKLRKMKFIDEVQEALYKGINLNKNQE
VILPDEIPIVNILYEYLIHIDKQIDRSSGLKLPFLRNDFVSINEVPTASSLKW 



SEQ ID NO 80: X70373. Kluyveromyces lactis GI:5538 

MSALLRDADLNDFISPGLACVKPAQPQKVEKKPSFEVEVGIESSEPEKVSISLQDCLA
CAGCITSSEEILLSKQSHKVFLEKWSELEELDERSLAVSISPQCRLSLADYYSMCLADL 
DRCFQNFMKTKFNAKYVVGTQFGRSISISRINATLKDRVPENEGPLLCSVCPGFVLYA 
EKTKPELIPHMLDVKSPQQITGNLLKQADPTCYHLSIMPCFDKKLEASREECEKEVDC
VITPKQFVAMLGDLSIDFKSYMTEYDSSKELCPSGWDYKLHWLSNEGSSSGGYAYQ 
YLLSLQSSNPESDIITIEGKNSDVTEYRLVSKSKGVIASSSEVYGFRNIQNLVRKLSQSA 
SWKRGIKVKRRGQSVLK^GETSEKTTKVLTADPAKTDFVEVMACFSGCINGGGLLN 
EEKNANRRKQLAQDLSU^YTKVHSWIPDIVHAYDDKSNDFKYNLRVIEPSTSSDVV 
AVGNTW



SEQ ID NO 81: NC_003232. Encephalitozoon cuniculi gi:17158048 


MDALIRPPMSFFADLPKDNKKCIKIGSPLALSLSDCLACSGCVSADEAGALSEDLSFV 

ldlspqtsfvlspqskinifnlyredgmeyrefeavlssflrskfnihrrvdtsylrski 
yeetyreymatnhlrvsacpgvvtyiertapyligylsrvkspqqmafslvkgsrtv 
svmpcqdkklengrdgvkfdfilttrgfckaldslgfrrparasgkslcsmeeaett 
qwnigtssggyaefilgkhcwetreirngikehllddgrtisqitglensinyfkssk
tkgprhkmteiflckngciggpgqervndvemdireydrngreqprifysspgleekr 
vfrevkakrvdlrvdw 



SEQ ID NO 82: AF312932 Tritrichomonas foetus GI:14164621

mcikacnsvagqgvlklvkvgnkklvstksgkplqetncikcgqctlvcgpgaltq 
kdaiqtvsevlknpgdkvlvcqtapairxnlaikjlgmpagsirrgkmvtalkmlgf 

KYVFDTNFGTDXnG 



SEQ ID NO 83: AJ271546 Scenedesmus obliquus GI:13311187

MPEWQPGGRYAVSVW>PVNRRAWAAERRia,VVRAAGPTAECIX:PPAPAPKAP
QQTLDELAKPKEQWCVMIAQIAPAVRVAIAEmGIJ^PGDVTVGQMVTGLl^ 
YVFiyrLFGADLTIMEEGTELRHW.QDHLEQHPNiaEEPLPMITSCCPGWVAMVEKSN 
PELIPYLSSCKSPQMMLGAVIKNWAAEAGAKPEDICNVSVMPCVRKQGEADREWF 
NTTGAGGANVDHVMTTAELGKJFVERGIKLNDLQESPFDNPVGEGSGGGVLFGTTG
GVMEAALRTVYEVVTQKPLDRIVFEDVRGLEGIKESTLHLTPGPTSPFKAFAGADGT 
GITLNIAVANGLGNAKKLIKQLAAGESKYDFIEVMACPGGCIGGGGQPRSADKQILQ 
KRQAAMYDLDERAVIRRSHENPLIGALYEKFLGEPNGHKAHELLHTHYVAGGVPDE 

SEQ ID NO 84: AAAB01008984 Anopheles gambiae GI:21302659

SRFSSALQLTDLDDFITPSQECIKPVKIETSKSKTGAKrnQEDGSYVQESSSGIQKLEK 
VEITLADCLACSGCITSAEGVLISQQSQEELLRVMNANNLAIOJWQRDEIKFVWTVS
QQPILSLARKYNLTPEDTFEHUGYFKiaXjADMVVDTKIADDLALIECI»ffiFffiRYNT 
NWCLLPMLASSCPGWCYAEKTHGNFILPYIATTRSPQQIMGVLVKQYLAKQLQTTG 
DRIYHVTVMPCYDKKLEASREDFFSEVENSRDVDCVITSIEIEQMLNSLDLPSLQLVE 
RCAIDWPWPTVRPSAFVWGHESSGSGGYAEYIFKYAARKLFNVQLDTVAFKPLRNN 
DMREAVLEQNGQVLMRFAIANGFRMQNMVQKLKRGKSTYDYVEIMACPSGCLNG
GAQIRPEEGRAARELTAELECMYRSLPQSTPENDCVQTMYATFFDSEGDLNKRQSLL 
HTSYHQIEKINSALNIKW 



SEQ ID NO 85: NC_004347 Shewanella oneidensis GI:24375408

NITTITYQPCMQGLIKINASKCKGCDACKQFCPTHAINGASGAVHSIDEDKCLSCGQC 
LINCPFSAffiETHSAI^TVIKKLADKNITWGIIAPAVRVAIGEEFGLGTGELVTGKLY 
GAMNQAGFKIFDCNFAADLTIMEEGSEFIHRLHANVKGEANAGPLPQFTSCCPGAWR 
YLETTlYPALLP>n:.STAKSPQQMAGTVAKTYGAKWQMQPENIFWSVMPCTSKKLE
ASRPEFTsfSAWQYHQEHGANSPSYQDIDAVLTTREMAQLLKLLDIDLANTAEYQGDS 
LFSEYTGAGTIFGTTGGVMEAALRTAHKVLTGTEMAKLEFEPVRGLKGYKSASVSLF 
DTELNQDVTVNVAVVHDMGNNIEPVLRDVMAGTSPYHFIEVMNCAGGCVNGGGQP 
lEGKGSSWLGNI 

SEQ ID NO 86: NZ_AABG02000026 Clostridium thermocellum GI:23021197

MAR^WR^^VRSRPFPKKPNGRGCEKMQMVN\^^DNCKIQWANYTVLEAAK^ 
PTLCFLKDINEVGACRMCVVEVKGARSLQAACVYPVSEGLEVYTQTPAVREARKVT 
LELILSNHEKKCLTCVRSENCELQRLAKDLNVKDIRFEGEMSNLPIDDLSPSWRDPN 
KCVLCRRCVSMCKI^QTVGAIDVTERGFRTTVSTAFNKPLSEVPCVNCGQCINVCPV 

GALREKDDroKVWEALANPELHVVVQTAPAVRVALGEEFGMPIGSRVTGKMVAAL
SRLGFKKVFDTDTAADLTIMEEGTELINRIKNGGKLPLITSCSPGWIKFCEHNYPEFLD 
NLSSCKSPHEMFGAVLKSYYAQKNGroPSKVFWSIMPCTAKKFEAQRPELSSTGYT 
DVDWLTTRELARMIKETGIDFNSLPDKQFDDPMGEASGAGVIFGATGGVMEAAIRT 
VGELLSGKPADKIEYTEVRGLDGIKEASIELDGFTLKAAVAHGLGNARKLLDKIKAG 

EADYHFIEIMACPGGCINGGGQPIQPSSVRNWKDIRCERAKAIYEEDESLPIRKSHENP
KIKMLYEEFFGEPGSHKAHELLHTHYEKRENYPVK 



SEQ ID NO 87: NZ_AABB01000275 Desulfitobacterium hafniense GI:23113285

IVmvIGQLRAALKHLGFYGMlEVALFADVLSLKEALEFDKHVQTDKDFVLTSCCCPIW 
VGM\^VYDlXVPfflSPSVSPMVACGRGIKJa.HPDAKTWIGPCIAKKAEAKEPDIRD 
A\a5AVLTFHELKQIFEATDIEPSEMEDIPSEHSSTSGRlYARTGGVSKSISDTLNRIRPD 
KPVKIKSIQANGIKECKALLNDIMNNEIKANFYEGMGCPGGCVGGPKAWDV^
FVNKYGAEADALTPADNQHVLELLKQLGIDSVEELLGGESAAIFQRDF 

SEQ ID NO 88: AAR04931. C. reinhardtii iron-hydrogenase gi:37911261

MALGLRAELRAGQAVACARRTNAPAHPAAWPVLPSRGDKFFNLSQKVPSSQPARG

STIRVAATATDAVPHWKLALEELDKPKDGGRKVLIAQVAPAVRVAIAESFGLAPGA 
VSPGKLAAGLRALGFDQVFDTLFAADLTIMEEGTELLHRLKEHLEAHPHSDEPLPMF 
TSCCPGWVAMMEKSYPELIPFVSSCKSPQMMMGAMVKTYLSEKQGIPAKDIVMVSV 
MPCVRKQGEADREWFCVSEPGVRDVDHVITTAELGNIFKERGIILPELPDSDWDQPL 
GLCBGAGWGTTGGVMEAAWTAYEIVTTtEPLPRLl^SEVRGLlXjIKEASVTLVPA
PGSKFAELVAARLAHKVEEAAAAEAAAAVEGAVKPPIAYDGGQGFSTDDGKGGLK 
LRVAVANGIXjNAKKLIGKMVSGEAKYDFVEIMACPAGCVGGGGQPRSTDKQITQK 
RQAALYDLDERlSrn.RRSHENEAVNQLYKEFLGEPLSHRAHELLHTHYVPGGAEADA 

SEQ ID NO 89: gi:37911254 AAR04930 C. reinhardtii iron-hydrogenase HydA2

MALGLRAELRAGQAVACARRTNAPAHPAAVVPVLPSRGDKFFNLSQKVPSSQPARG
STIRVAATATDAVPHWKLALEELDKPKDGGRKVLIAQVAPAVRVAIAESFGLAPGA 
VSPGKLAACaRALGFDQWDTLFAADLTIMEEGTELLHRLKEHLEAHPHSDEPLPMF 
TSCCPGWVAMMEKSYPELIPFVSSCKSPQMMMGAMVKTYIiJEKQGIPAKDIVMVSV 
MPCVRKQGVADREWFCVSEPGVRDVDHVITTAELGNIFKERGIILPELPDSDWDQPL 

GLGSGAG\aJGTTGGVMEAAVRTAYEIVTKEPLPRLNLSEVRGLDGIKEASVTLVPA
PGSKFAELVAARLAHKVEEAAAAEAAAAVEGAVKPPIAYDGGQGFSTDDGKGGLK 
LRVAVANGLGNAKKLIGKMVSGEAKYDFVEIMACPAGCVGGGGQPRSTDKQITQK 
RQAALYDLDERNTLRRSHENEAVNQLYKEFLGEPLSHRAHELLHTHYVPGGAEADA 

SEQ ID NO 90: gi:15642974 T. maritima

MRRFFKNNLRNLSQNGETNSVRRCFALADVTVYINGRTLTVPDNLTVIEACEKAGIEI 

PALCHHPRLGESIGACRVCVVEVEGAKNLQPACVTKVRDGMVIKTSSDRVKTARKF 

NLALLLSEHPNDCMTCEANGRCEFQDLIYKYDVEPIFGYGTKEGLVDRSSPArVRDLS 

KCKCQRCVRACSELQGNOBYSMVERGHRTYPGTPFDMPVYETDCIGCGQCAAFCPT
GAIVENSAVKWLEELEKKEKILWQTAPSVRVAIGEEFGYAPGTISTGQMVAALRR 
LGFDYVFDTNFGADLTIMEEGSEFLERLEKGDLEDLPMFTSCCPGWVNLVEKVYPEL 
RTRI^SAKSPQGMLSAMVKTYFAEKLGVKPEDIFHVSIMPCTAKKDEALRKQLMVN 
GVPAVDVYLTTRELGKLIRMKKIPFANLPEEEYDAPLGISTGAAALFGVTGGVMEAA 

LRTAYELKTGKALPKIVFEEVRGLKGVREAEIDLDGKKIRIAVYHGTANVRNLVEKIL
RREVKYHFVEVMACPGGCIGGGGQPYSRDPEILRKRAEAIYTIDERMTLRKSHENPAI 
KKLYEEYLEHPLSHKAHELLinYYEDRSRKKRLAVK 

SEQ ID NO 91: C. reinhardtii gi:9837540

MSALVLKPCAAVSIRGSSCRARQVAPRAPLAASTVRVALATLEAPARRLGNVACAA 
AAPAAEAPLSHVQQAIAELAKPKDDPTRKHVCVQVAPAVRVAIAETLGLAPGATTP 

KQLAEGLRRLGFDEVFDTLFGADLTIMEEGSELLHRLTEHLEAHPHSDEPLPMFTSCC
PGWIAMLEKSYPDLIPYVSSCK^PQMMLAAMVKSYLAEKKGIAPKDVmvlVSDVIPCT 
RKQSEADRDWCVDADPTLRQLDHVITTVELGNIFKERGINLAELPEGEWDNPMGV 
GSGAGVLFGTTGGVMEAALRTAYELFTGTPLPRLSLSEVRGMDGIK^INITMVPAPG 
SKFEELLKHRAAARAEAAAHGTPGPLAWDGGAGFTSEDGRGGITLRVAVANGLGN 

AKKLITKMQAGEAKYDFVEIMACPAGCVGGGGQPRSTDKArrQKRQAALYNLDEKS
TLRRSHENPSmELYDTYLGEPLGHKAHELLHTHYVAGGVEEKDEKK 



SEQ ID NO 92: T. tengcongensis NP_622546.1

MDKVRVrroGITVEVFSYYTVLEAAKEAGIDIPTLCYLKEINQIGACRICLVEIEGVKN 
LQTSCTYPWDGMKVYThm»KIREARRLNLELILSNHDRNCLTCVRSTNCELQALAK 
RLGVEEIRFEGENIKYPIDDASPAVVRDPNKCVLCRRCVAVCSEVQNVFAIGMVNRG 

FKTMVAPSFGRSLKDSPCISCGQCIMVCPVGAIYEKDHTKRVYEALADDKKYVVAQ
TAPAVRVALGEEFGMPVGTIVTGKMAAALRRMGFDAVFDTNFAADLTIMEEGSELL 
ERIKHGGKLPMITSCSPGWIAFCEKYYPEFIDNLSTCKSPHMMMGALVKSYYAEKKG 
LDPKDIFVVSIMPCTAKKLEffiREEMIRNGMKDVDAVLTTRELARMIKEMGIDFVNL 
KDEEFDEPLGMSTGAGAEFGATGGVMEAALRTVAEIVEGRDIGKTOFEEVRGLEGVR 

EATITTDGNTOIKIAIANGTGNAKKLLPKVKAGEVEYHFffiVMGCPGGCIMGGGQPIH
NPNENffiEViaa-RAKAIYEroKlUPIRKSHENPAIKRLYEEFLGYPLSEKSHELLHI^ 
SRKELYPLVK 



SEQ ID NO 93: N. frontalis AAK60409.1

MSMLSSVLNKAVVNPKLTRSLATAAAEKMVNISINGRKFQVKPKTTVLEAAKANGY 

YIPTLCYHQELPVAGNCRLCLVYAKGSWKPLTACTTEVWEGMEIETDSFAVIETVRS 

SLSMMREEHPNDCMTCGSNGDCEFQDLmiYQIDAKHPVRSLLKHKSKKTNHSITEP 
CYSPFDimTSVARDMNKCVKCGRCIRACHHFQNimCfflNRACfYERVGTPMDRPM
NFTKCVECGQCSQVCPVGAITARTEVVDVLRHLDTKKKVVVCSTAPAIRVAPAEEFS 
TEADFDFTGKMVAGUlKLGFDYIFDTNFSADLTIMEEGTELroRLlWGGKFPM 
CPGWINMVEKSYPELSDl^SSCKSPQQMIGAVIKSYFAKKLGLSTEDIIHVSIMPCTA 
KKGEARRPEFVQKGKDGKDYPDroYVITTRELLTLLKLKKINPAELPDDKFDSPLGIG
SSAGNLFGVTGGVMEAAIRTAQVTTGVENPIPLGELKAIRGLDGIKAANVPLKTKDG 
KEVSVRAAWSGGANIQKFLEKIKNKELEFDFIEMMMCPGGCINGGGQPKSADPEIV 
AKmQRJVrmiDDQAKLRLCHENPEnDWKNFLGEPNSHLAHELLHTHYNDRSKTI 
HDMGHHEKK 

SEQ ID NO 94: AAD33071.1 C. thermocellum gi:4927278

MVlsrVTIDNCKIQVPANYTVLEAAKQANIDIPTLCFLKDINEVGACRMCVVEVKGARS 
LQAACVYPVSEGLEVYTQTPAVREARKVTLELILSNHEKKCLTCVRSENCELQRLAK
DLNVIOJmFEGEMSNLProDLSPSVVRDPNKCVLCRRCVSMCKNVQTVGAIDVTERG 
FRTTVSTAFNKPLSEVPCVNCGQCINVCPVGALREKDDIDKVWEALANPELHVVVQT 
APAVRVALGEEFGMPIGSRVTGKMYAALSRLGFKKVFDTDTAADLTIMEEGTELINR 
IKNGGKLPLITSCSPGWIKFCEHNYPEFLDNLSSCKSPHEMFGAVLKSYYAQKNGIDP 
SKVFVGSIMPCTAKKFEAQRPELSSTGYPDVDVVLTTRELARMIKETGIDFNSLPDKQ
FDDPMGEASGAGVIFGATGGVMEAAIRTVGELLSGKPADKIEYTEVRGLDGIKEASIE 
LDGFTLKAAVAHGLGNARKLLDKIKAGEADYHFIEIMACPGGCINGGGQPIQPSSVR 
NWKDJRCERAKArV'EEDESLPIRKSHENPKnCMLYEEFFGEPGSHKAHELLHTHYEKR 
ENYPV 

SEQ ID NO 95: B. thetaiotaomicron gi:29345534 NP_809037.1

MEEKQrrLQIDGHFTTVPEGSTILEAACKIGINIPTLCHIDLKGTClKNNPASCRICVVEV 
AGRRNLAPACATRCTEGMYVKTSTLRVMNARKWAELILSDHPNDCLTCPKCGNCE
LQTLALRFNIREMPFNGGELSPRKREVTSSrVRNMDKCIFCRRCESVCNDVQTVGALG 
AIRRGFNTTIAPAFDRMMKDSECTYCGQCVAVCPVGALTERDYTNRLLDDLADPDKI 
VIVQTAPAVRAALGEEFGLPPGTLVTGKMVYALRELGFDYVFDTDFAADLTIMEEGS 
EILNRLTRYLDGDKSVRLPILTSCCPAWVNFFEHHFPDMLDIPSTARSPQQMFGSIAKS 
WAEmGIPREia.VWSIMPCLAKKYECDRDEFKVNGVPDVDYSISTRELARLIKRA
MGITLVIJ>SPFD]^MGESTGAGVIFGTTGG\MAAIJlSVYEIYTCilQPLia^ 
RGLSGVRRATTOLNGreUCVGIAHGLGNARHLLEDIRNGHNEYHVIEIMACPGGCIGG 
GGQPLHHGNSDVLYARANALYREDANKPLRKSHDNPYIQKLYEEYLGKPLGEKSEM 
LLHTHYFNKSID



SEQ ID NO 96: gi:2126449 D57150 D. fructosovorans

MSMLTrnDGKTTSVPEGSmDAAKTLDroiPllCYLNLEALSINNKAASCRVCVV 
EGRR]^APSCATPVTDNMVVKTNSLRVLNARRTVLELLLSDHPKDCLVCAKSGECE 
LQTLAERFGIRESPYDGGEMSHYRKDISASnRDMDKCIMCRRCETMCNTVQTCGVLS 
G\asiRGFrAVVAPAFEMNLADTVCTNCGQCVAVCPTGALVEHEYIWEVVEALANPD 

KWIVQTAPAVRAALGEDLGVAPGTSVTGKMAAALRRLGFDHVFDTDFAADLTIME
EGSEFLDRLGKHLAGDTNVKLPILTSCCPGWVKFFEHQFPDMLDVPSTAKSPQQMFG 
AIAKTYYADLLGIPREKLVWSVMPCLAKKYECARPEFSVNGNPDVDIVITTRELAKL 
VKRMNIDFAGLPDEDFDAPLGASTGAAPIFGVTGGVIEAALRTAYELATGETLKKVD 
FEDVRGMDGVKKAKVKVGDNELVIGVAHGLGNARELLKPCGAGETFHAIEVMACP 

20 GGCIGGGGQPYHHGDVELLKKRTQVLYAEDAGKPLRKSHENPYIIELYEKFLGKPLS 
ERSHQLLHTHYFKRQRL 

SEQ ID NO 97: D. vulgaris gi:97381 S13526 

MNAFINGKEVRCEPGRTILEAARENGHFIPTLCELADIGHAPGTCRVCLVEIWRDKEA
GPQIVTSCTTPVEEGMRIFTRTPEVRRMQRLQVELLLADHDHDCAACARHGDCELQ 
DVAQFVGLTGTRHHFPDYARSRTRDVSSPSWRDMGKCIRCLRCVAVCRNVQGVD 
ALWTGNGIGTEIGLRHNRSQSASDCVGCGQCTLVCPVGALAGRDDVERVIDYLYDP 
EIVTVFQFAPAVRVGLGEEFGLPPGSSVEGQVPTALRLLGADVVLDTNFAADLVIME 

30 EGTELLQRLRGGAKLPLFTSCCPGWVNFAEKHLPDILPHVSTTRSPQQCLGALAKTY 
LARTMNVAPERMRVVSLMPCTAKKEEAARPEFRRDGVRDVDAVLTTREFARLLRRE 
GIDLAGLEPSPCDDPLMGRATGAAVIFGTTGGVMEAALRTVYHVLNGKELAPVELH 
ALRGYENVREAVVPLGEGNGSVKVAVVHGLKAARQMVEAVLAGKADHVFVEVM 
ACPCKK:MDCXjGQPRSKRAYNPNAQARRAALFSLDAENALRQSHNNPLIGKVYESFL
GEPCSNLSHRLLHITIYCH5RKSEVAYTMRDIWHEMTLGRRVRGDSD 



SEQ ID NO 98: T. vaginalis gi:11127701 AAG31037.1

ASTGINSTANILRNITVT\n^GKPLEAKKGET\a.ELCDRNNIRIPRLCFHPNLPPKASCR 
VCLVECDGKWLSPACVITVWDGLKroTKSKNVRDS\^NNLKELLDCHDETCSACIA
NHRCQFRDMNVAYSVKAETKEICSEEGIDESTNAIRLDTSKCVLCGRCIRACEEVAGT 
SAIIFGNRAKKMRIQPTFG\ai.QETSCIKCGQCTLYCPVGAITEKSQVKEALDILANK6 
KKrrWQVAPAVRVALSEAFGYKEGTNTITGKMVSALKALGFDLVYDTOYGADLTIC 
EEAGELVNRLRDPNAKFPMFTTCCPAWVNYVEQSAPDFIPNLSSCRSPQGMLSALIK 
NYLPKLLDVKQEDVLNFSIMPCTAKKDEVERPELRTKSGLKETDMVLTTRELVEMIK
LSNIDFNNLPDTQFDNIFGFGSGAGQIFAATGGVMEAASRTAFEVYTGKKLTNVNIYP 
VRGMDGLMAELDLDGTKLK^VAVCHGIANTAKLLDRLREKDPELMDIKFIEIMACPG 
GCVCGGGTPQPKNRVSLDNRLAATWIDAKMECRKSHENPLIKGVYKEFLGKPNSHL 
AHELLHTHFKHHPKW 


SEQ ID NO 99: gi:7522122 T18557 

MISmAKKAPLFLRTFATSEMISLKTOGKIISWKGIMLADAIKKAGANVPTMCYHPD 
LPTSGGICRVCLVESAKSPGYPIISCRTPVEEGMEIVTQGSKMKEYRQANLALMLSRH 
PNACLSCTSNTNCKTQELSANMMGQCGFANATPPKNDDSYDMTTAIERDNDKCINC 

DICVHrCSLQGLNALGFYNEECMAVKSMGTLDVSECIQCGQCINRCPTGAITEKSEIR
PVLDAINIQQRLVFQMAPSIRVAVAEEFGIKPGEKILKNEIATALRKLGSNVFVLDTNF 
SADLTIffiEGHELffiRLYMSrVTGiaa.LGGDHMProLPMLTSCCPGWIMFIEKNYPDLL 
NNI^TCKSPQGNfLGALIKGYWAKNIKmDPKDIVSVSIMPCTAKKAEKERPQLRGD 
EGYKDVDYILTreELAKMLKQSNTOLAKMEFTPFDKVMSEGTGAAVIFGWGGVME 

AALRTANEVITGREVPFKNLNIEAVRGMEGIREAGIKLENVLDKYKAFEGVTVKVAI
AHGPNNARKVMDIIKQAKESGKPAPWHFVEVMACPGGCIGGGGQPKPTNLEIRQAR 
TQLTFKEDMDLPLRKSHDNPEIKArVENYLKEPLGHNSHHYLHTTYSSQKVRDMNLY 
NANEAAGLDEILAKYPKEKEYLMPIIIEEHDKKGYISDPSIVKISEHLGMYPAQIESILS 
SYHYFPREHTIAn.MSICVHCHNCMNIKGQGRLLKTIQETYDIHETHGGVAKDGSFTL 




Attorney Docket Number: H2042101-CIP 
Inventor: Harrison F. Dillon 



HTLNWLGYCVNDAPAMMKWCGTbryVETITGLLGDNroQRLKSLKNLKKEL^^ 
KNMREMKSQRNGNSYSCMOTQAPIAEATKKAVSMGPEKVIEEVFKShnLVGRGGA 
FRTGKKWESAYKTPASDKYWCNADEGLPSTYKDWCLLNNEAKRKEVFTGMGICA 
KTIGAKRCFlVm.RYEYTtNLVPALEQSIKDVQSTCPELADLKYEIRLGGGPYVAGEEN 
AQFESffiGRAPLPRKDRPGMFPTMEGLFHKPTVINNVETFFAIPHnQQGSQSFGEGKM
PKLLSVTGDVDEPILIETNLNNYSLNHLLQEISAKDIVAAEIGGCTEPIIFGSKFDTLFGF 
GRGTLNAVGSWLFNS SCDLGKIYENKLKFMAEESCKQCVPCRDGSYIFHRAFKELR 
DTGKSSYNMRALAVASESAARSSICAHGKALESLFKSACDFMNKTKPIYQPHSTYHQ 

SEQ ID NO 100: T. vaginalis gi:1171117 AAC47159.1

MLASSATAMKGFANSLRMKDYSSTGINFDMTKCINCQSCVRACTNIAGQNVLKSLT 
WGKSVVQTVTGKPLAETNCISCGQCTLGCPKFTIFEADAINPVKEVLTKKNGRIAVC 

QIAPAIRINMAEALGWAGTISLGKVVTALKRLGFDYWDTNFAADMTIVEEATEL^
RLSDKNAVLPMFTSCCPAWVNYVEKSDPSLIPYLSSCRSPMSMLSSVIKNVFPKKIGT 
TADKnT^VAIMPCTRKKDEIQRSQFTMKDGKQETGA\a.TSRELAKMKEAKINFKEL 
PDTPCDNFYSEASGGGAIFCATGGVMEAAVRSAYKFLTKKELAPIDLQDVRGVASG 
VKLAEVDIAGTKVKVAVAHGIKNAMTLIKKIKSGEEQFKDVKFVEVMACPGGCVVG 

GGSPKAKTKKAVQARLNATYSIDKSSKHRTSQDNPQLLQLYKESFEGKFGGHVAHH
LLHTHYKNRKVNP 

SEQ ID NO 101: C. acetobutylicum gi:15893326 NP_346675.1

MKTIILNGNEVHTDKDITILELARENNVDIPTLCFLKDCGNFGKCGVCMVEVEGKGF
RAACVAKVEDGMVINTESDEVKERIKKRVSMLLDKHEFKCGQCSRRENCEFLKLVI 
KTKAKASKPFLPEDKDALVDNRSKATVIDRSKCVLCGRCVAACKQHTSTCSIQnKKD 
GQRAVGTVDDVCLDDSTCLLCGQCVIACPVAALKEKSHIEKVQEALNDPKKHVIVA 
MAPSVRTAMGELFmGYGKDVTGKLYTALRMLGFDKVFDINFGADMTINfEEATEL 

LGRVKNNGPFPMFTSCCPAWVRLAQNYHPELLDNLSSAKSPQQIFGTASKTYYPSISG
lAPEDVYTWIMPCNDKKYEADffFMETNSLRDIDASLTTRELAKMIKDAKIKFADLE 
DGEVDPAMGTYSGAGAIFGATGGVMEAAIRSAKDFAENKELENVDYTEVRGFKGIK 
EAEVEIAGNKLNVAVINGASNFFEFMKSGKMNEKQYHFIEYMACPGGCINGGGQPH 
VNALDRENVDYRKLRASVLYNQDK]m.SKRKSHDNPAIIKMYDSYFGKI<}EGl^
KLLHVKYTKDKNVSKHE 



SEQ ID NO 102: gi:130069 P29166 c.p?

MKTIIINGVQF^rroEDTTILKFARDNNIDISALCFLNNCNNDINKCEICT^^ 
TACDTLIEDGMIINTNSDAVNEKIKSRISQLLDIHEFKCGPCNRRENCEFLKLVIKYKA 

RASia>FLPKDKTEYVDERSKSLTVDRTKCLLCGRC W ACGKNTETYAMKFLNKNGK
TIIGAEDEKCFDDTNCLLCGQCIIACPVAALSEKSHMDRVKNALNAPEKHVIVAMAP 
SVRASIGELFNMGFGVDWGKIYTALRQLGFDKIFDINFGADNfTlMEEATELVQRIEN 
NGPFPMFFSCCPGWVRQAENYYPELLNNLSSAKSPQQIFGTASKTYWSISGLDPKNV 
FrVTVMPCTSKKFEADRPQMEimGLRDroAVITraELAKMIKDAKIPFAia.E 

PAMGEYSGAGAIFGATGGVMEAALRSAKDFAENAELEDIEYKQVRGLNGIKEAEVEI
NNNKYNVAVINGASNLFKFMKSGMINEKQYIffIE^^ACHGGCWGGGQPHV^P 
LEKVDIKKVRASVLYNQDEHLSKRKSHEOTALVKMYQNYFCHCPGEGRAHEILHFKY 
KK 

SEQ ID NO 103: gi:130070 P07598 ?

MSRTVMERIEYEMHTPDPKADPDKLHFVQIDEAKCIGCDTCSQYCPTAAIFGEMGEP 
HSIPHIEACINCGQCLTHCPENAIYEAQSWVPEVEKKLKDGKVKCIAMPAPAVRYAL 
GDAFGMPVGSVTTGKMLAALQKLGFAHCWDTEFTADVnWEEGSEFVERLTKKSD 
MPLPQFTSCCPGWQKYAETYYPELLPHFSTCKSPIGMNGALAKTYGAERMKYDPKQ
Vm^SIMPCIAKKYEGLRPELKSSGMRDroATLTrRELAYMIKKJ^GroFAKI^DGI^ 
DSLMGESTGGATIFGVTGGVMEAALRFAYEAVTGKKPDSWDFKAVRGLDGIKEATV 
NVGGTDVKVAVVHGAKRFKQVCDDVKAGKSPYHFIEYMACPGGCVCGGGQPVMP 
GVLEAMDRTFTRLYAGLKKRLAMASANKA 


SEQ ID NO 104: ? gi:1345094 AAC47160.1 

MLASSSRAAAMRWVDTSHNAIAFDMHKCINCQACWACKNVAGQSVLKSVKINEG 
KKKGVVQTWGKIXAETNCIGCGQCTLVCPTQAIHEKDALKQMNNIFKNKGDRILV 
CQIAPAIRINMRW»WCSSRNSFHRQSRYSPQRLGFDYWDTNFGADLTIVEEATELL^
RLNDPKAVLPMFreCCPAWVNY\^KSYPQWMPHLSTCRSPIGMLSAVIKhrsaTK^ 
\a)PKRIFSVGIMPCTAKKDEAAREQIMnCSGLHETDLDITSRELAKM[KA^ 
PDTELDSPYAMATGGGAIFCATGGVMEAAVRSAYKFATGKELAPIEFVQVRGAEKGI 
KVGTVDINGREIKVAVAQGVKNAMSLIKKIEEGQDDVKGVVFCEVMACPGGCVGG
GGSPRAKTKAAMNKIU.DATYRmRASKYRTPQDNTQLQPLYNATWVVSLVMD 

SEQ ID NO 105: gi:15644177 NP_229226.1 T. maritima

MKIYVDGREVnNDNERNU^ALKNVGffiffNLCYLSEASIYGACRMCLVEmGQnTS 
CTLKPYEGMKVKTNTPEIYEMRRI^ELILATHNRDCTTCDRNGSCKLQKYAEDFGI 
RKIRFEALKKEHVRDESAPVVRDTSKCILCGDCVRVCEEIQGVGVIEFAKRGFESVVT 
TAFDTPLffiTECVLCGQCVAYCPTGALSIRNDIDKLffiALESDKIVIGMIAPAVRAAIQE 

EFGIDEDVAMAEKLVSFLKTIGFDKVFDVSFGADLVAYEEAHEFYERLKKGERLPQF
TSCCPAWVKHAEmYPQYLQNLSSVKSPQQALGTVIKKIYARKLGVPEEKIFLVSFM 
PCTAKKFEAEREEHEGIVDIVLTTRELAQLIKMSRIDINRVEPQPFDRPYGVSSQAGLG 
FGKAGGVFSCVLSVLNEEIGIEKVDVKSPEDGIRVAEVTLKDGTSFKGAVIYGLGKV 
KKFLEERKDVEIIEVMACNYGCVGGGGQPYPNDSRIREHRAKVLRDTMGIKSLLTPV 

ENLFLMKLYEEDLKI)EHTRHEILimYRPRRRYPEKDVEILPVPNGEKRTVKVCLGT
SCYTKGSYEILKKLVDYVKENDMEGKIEVLGTFCVENCGASPNVIVDDKnGGATFEK 
VLEELSKNG 



SEQ ID NO 106: T. vaginalis gi:19547859 AAK26191.1

CIXjKWI^ACVTTVWDGLiaDTK5KMVKESVENNLKELLDCHDETCSSCVANHRC 
QFRDMNVAYSnCAETKEECSEEGIDESTNSIRLDTSKCVLCGRCIRACEEVAGQSAIIF 
GNRAKHMRIQPTFGQTLQDTSCIKCGQCTLYCPVGAITEKSQVKQALDILSNKGKKIS 
VXQVAPAVRVALSEAFGYKEGSVTTGKMVSALKALGFDYVYDTNYSADLTFVEEAG

ELVQRLKNPNAVFPMFTSCCPA\^^VNYVEQSAPDFIPNLSSCRSPQGMLSSLVKNYLP 
KVLNIPVEDVLNFSIMPCTAKKDEIERPELRTKDGHKETDMVLTVRELVEMIKLSGID 
FNNLPDTPFDSIFGFGSGAGQIFAAT 

SEQ ID NO 107: R. norvegicus gi:34870568 XP_340771.1

MASPFSGALQLTDLDDFIGPSQSCIKPVTVAKKPGSGIAKIHIEDDGSYFQVNPDGRSQ 
KLEKj\KVSLNDCLACSGCVTSAETILITQQSHEELRKVLDANKVAAPGQQRLVVVSV
SPQSRASLAARFQLDSTDTARKLTSFFKKIGVHFVFDTAFARNFSLLESQKEFVQRFR 
EQANSREALPMLASACPGWICYAEKTHGNFILPYISTARSPQQVMGSLIKDFFAQQQL 
LTFDKIYHVTVMPCYDKKLEASRPDFFNQEYQTRDVDCVLTTGEVFRLLEEEGVSLS 
ELEPVPLDGLTRSVSAEEPTSHRGGGSGGYLEHVFRHAAQELFGIHVADVTYQPMRN 
KDFQEVTLEREGQVLLRFAVAYGFRMQNLVQKLKRGRCPYHYVEVMACPSGCLNG
GGQLKAPDTEGRELLQQVERLYSMVRTEAPEDAPGVQELYQHWLQGEDSERASHLL 
HTQYHAVEKINSGLSIRW 

SEQ ID NO 108: S. cerevisiae gi:14336719 AAK61251.1

MASPFSGALQLTDLDDFIGPSQVGSLQALLALAFLHTGNFSAAGCWEPDPWECIKPV 

KVEKRAGSGVAKIRIEDDGSYFQINQEKLGELELEPTFGIFLPYSPDGGTRRLEKAKVS 

LNDCLACSGCITSAETVLITQQSHEELKKVLDANKMAAPSQQRLVVVSVSPQSRASL 

AARFQLNPTDTARKLTSFFKKIGVHFVFDTAFSRHFSLLESQREFVRRFRGQADCRQA
LPLLASACPGWICYAEKTHGSFILPHISTARSPQQVMGSLVKDFFAQQQHLTPDKJYH 
VTVMPCYDKKLEASRPDFFNQEHQTRDVDCVLTTGEVFRLLEEEGVSLPDLEPAPLD 
SLCSGASAEEPTSHRGGGSGGYLEHWRHAARELFGIHVAEVrYKPLRNKDFQEVTL 
EKEGQVLLHFAMAYGFRNIQNLVQRLKRGRCPYHYVEVMACPSGCLNGGGQLQAP 

DRPSRELLQFTVERLYGMVRAEAPEDAPGVQELYTHWLQGTDSECAGRLLHTQYHA
VEKASTGLGIRW 

SEQ ID NO 109: C. perfringens gi:18311328 NP_563262.1

MNKniNDKTIEFDGDKTILDLARENGFDIPVLCELKNCGNKGQCGVCLVEQEGNDRL
LRSCAIKAKDGMVIKTDSEKVLEARKERVAELLDEHEFKCGPCKRRENCEFLKLVIK 
TKARAHKPFWADKSEYVDDRSKSIVLDRSKCVKCGRCVAACRTRTATNSIKFHRID 
GVRLVGPEELKCFDDTNCLLCGQCIAACPVDALSEKSHIERVQDALNDPEKHVIVAM 
APAVRTSMGELFKMGYGQDVTGKLYTALRELGFDKVFDINFGADMTIMEEATELIE 
RKNNGPFPNlLTSCCPSWVREVEhrWJPELVENLSSAKSPQQIFGAASKTYYPQVADID
FKKWTVTVJ^CTSKIOfEADRPEMEl^GIRl^AVITM 

EVDPAMGEYTGAGVIFGATGGVMEAALRTAKDFMENDNLD^fVDYEAVRGLAGIKE 
AEVEIAGNEYKLAWSGAANVFELVKSGKINDYHFIEVMACPGGCVNGGGQPfflSAE 
DSDIQDIREVRASVLYNQDKNLEKRKSHQNSALLKMYENYMGKPGaiGRAHELLHM
KYKK 



SEQ ID NO 110: gi:4239873 BAA74726.1 C. perfringens

MNKnnSfDKTIEFDGDKTILDLARENGFDIPVLCELKNCGNKGQCGVCLVEQEa^RL 
UISCAKAKDGMVIKTDSEKVLEARKERVAELLDEHEFKCGPCKRR^CEFLKLVIK 
TKARAHKPFVVADKSEYVDDRSKSrVLDRSKCVKCGRCVAACRTRTATNSIKFHRID 
GVRLVGPEELKCFDDTNCLLCGQCIAACPVDALSEKSHIERVQEALNDPEKHVrVAM 

APAWTSMGELFmGYGQDVTGKLYTALRELGFDKVFDINFGADMTIMEEATELIE
RIKNNGPFPMLTSCCPSWVREVE^m'PELVENLSSAKSPQQIFGAASKTYYPQVADID 
PKKWTVTVMPCTSKKFEADRPEMENEGIRNIDAVITTRELARMIKAAI^ 
EVDPAMGEYTGAGVIFGATGGVMEAALRTAKDFMENDNLDNVDYEAVRGLAGIKE 
AEVEIAGNEYKLAVVSGAANVFELVKSGKJNDYHFIEVMACPGGCVNGGGQPHISAE 

DSDKJV4DIREVRASVLYNQDKNLEKRK5HQNSALLKN4YESYMGKPGHGRAHELLH
MKYKK



SEQ ID NO 111: C. tetani gi:28212003 NP_782947.1

N!IVFENQUaa.KYLVLKEVAKMTLEDRLGEEDIERISFDIIKGDKAEYRCCVYKERAI 
VYERAKLATGCLPNGQVAEEFVHVEDDDQIXYVTOAACDKCPINKYVVTEACRGCLQ 
HKCMEVCPAGSINRAAGKAYINHETCKECGLCESACPYNAIAEVMRPCRRACPTGA 
LQMNLEDNKATINKEDCINCGSCMSVCPFGAISDKSYWDrrKALKNNKKVYAMVAP

AITGQFGKDVSVGKMKNAFKAMGFEDMLEVACGADAVAAHESEEFIERLESGKKY 

MTTSCCPGFLGYIEKKFPDQLENVSNTVSPMVAIGRMIKKEYEDSVVVFVGPCTAKK 

AEKRKGIKDAVDYVMTFEEIAALMGAFEIDPAECEEEDINDGSNYGRGFAQGGGVY 
SAIQNCIKDKEGnOWLRVSC^DQIKRAMIMAKVGKLSElSIFEGMMCEGGCIGGPAT
MVSAVKAKAPLMKFSKSSTTKDVKDNEVLDKYKDDSIMER 

SEQ ID NO 112: C. tetani gi:28209953 NP_780897.1


MHNDYREIFKRLSKSYYDDTFEKEVENILSSHSMDREKLAKIISILCGVNIEHSENYIS 
M.KNAIKNYTASAEKVWKLPCSTQCAKDGDIICEKSCPVNAIFRDPNDNNIYnslDEL 
CLDCGLCVRNCPSGSILDKKEFIPLAELLKSESIVIAAVAPAMGQFGENTTINQLRTAF 
KKLGFTDMVEVAFFADMLTLKEAVEYDHFVKDEQDFMITSCCCPMWVGMLKKVY 
NDLVKYVSPSVSPMIAAGRVLKLLNPNCKVVFVGPCIAKKAEAREKDLLGDIDFVLT
FTELRDBFDVFDIQPENLEEDFSSEYASKGGRLYARTGGVSIAVSEAIEKLFPNKYKFL 
KTIQADGVKGCKSLLDKIKQEDISANFVEGMGCVGGCVGGPKVIIDPSEGRNAVNNF 
AENSSKVSVDSNCMNDILSKININSVEDFKDKDKISIFEREFK 

SEQ ID NO 113: Pyrococcus furiosus delta (small) subunit GI:562776 X75255

MGKVRIGFYALTSCYGCQLQLAMMDELLQLIPNAEIVCWFMIDRDSIEDEKVDIAFIE 

GSVSTEEEVELVKKIRENAKIVVAVGACAVQGGVQSWSEKPLEELWKKVYGDAKV 
KFQPKKAEPVSKYIKVDYNIYGCPPEKKDFLYALGTFLIGSWPEDIDYPVCLECRLNG 
HPCILLEKGEPCLGPVTRAGCNARCPGFGVACIGCRGAIGYDVAWFDSLAKVFKEKG
MTKEEIIERMKMFNGHDERVEKMVEKIFSGGEQ 

SEQ ID NO 114: Escherichia coli small subunit GI:2668498 M63654


MSPVLTQHVSQPITLDEQTQKMKRHLLQDIRRSAYVYRVDCGGCNACEIEIFAAITPV 
FDAERFGIKWSSPRHADILLFTGAVTRAMRMPALRAYESAPDHKICVSYGACGVGG 
CflFHDLYSVWGGSDTrVPIDVWIPGCPPTPAATfflGFAVALGLLQQKIHAVDYRDPTG 
VmQPLWPQIPPSQRIAffiREARRLAGYRQGREICDRLLRHLSDDPTGNniVNTWLRD 
ADDPRLNSrVQQLFRVLRGLHD



SEQ ID NO 115: Methanothermobacter thermautotrophicus GI:149717 small subunit
J02914
MAEENAKPRIGYTHLSGCTGDAMSLTEN¥T)ILAEIXTNMVDIVYGQTLVDLWEMP
MDLALVEGSVCLQDEHSLHELKELREKAKLVCAFGSCAQTGCFTRYSRGGQQAQPS 
HESFWIADLroVDLAIPGCPPSPEIIAKAWALLNNDMEYLQPMLDLAGYTEACGCD 
LQmVVNQGLCTGCGTCAMACQTRALDMTNGRPELNSDRCIKCGICYVQCPRSWW
PEEQIKKELGL 



SEQ ID NO 116: Y13764 Methanosarcina barkeri GI:2463276 small subunit

MANKKLGHVHLSGCTGCLVSVADNYQGFLKILDDYADLVYCLTLADVRHIPEMDV 
ALVEGSVCIQDRESVEDIKETRKKSRIWALGSCASYGNITRFCRGGQHNHPQHESYL 
PIGDLIDVDVYIPGCPPSPELIRNVAIMAYLLLEGNEEQKDLAGRYLKPLMDLAKRGT 
TGCFCDLMDDVmQGLCIGCGICAASCPVRAITHEFGKPQGDLNLCIKCGSCYGACPR
SFFNPDVISEFESINEIIAGALKEGEKDD 



SEQ ID NO 117: U65510 Rhodospirillum rubrum GI:1515468 small subunit

MNFLSRMSKKSPWLYRINAGSCNGCDVELATTACIPRYDVERLGCQYCGSPKHADI 

VLVTGPLTARVKDKVLRVYEEIPDPKVTVAIGVCPISGGVFRESYSIVGPIDRYLPVDV 

NVPGCPPRPQAIIEGIAKAIEIWAGRI 


SEQ ID NO 118: X75255. Pyrococcus furiosus alpha (large) subunit GI:562777

MKNLYLPmDmARVEGKGGVEinGDDGVKEVKLNIIEGPRFFEArnGKKLEEALAIY 
PRICSFCSAAHKLTALEAAEKAVGFVPREEIQALREVLYIGDMIESHALHLYLLVLPD
YRGYSSPLKMVNEYKREIEIALKLKNLGTWMMDILGSRAIHQENAVLGGFGKLPEKS 
VLEKMKAELREALPLAEYTFELFAKLEQYSEVEGPITHLAVKPRGDAYGIYGDYIKA 
SDGEEFPSEKYRDYIKEFVVEHSFAKHSHYKGRPFMVGAISRYINNADLLYGKAKEL 
YEANKDLLKGTNPFANNLAQALEIVYFIERAIDLLDEALAKWPIKPRDEVEIKDGFGV 
STIEAPR(HLWALKVENGRVSYADIITPTAFNIJ^MMEEH\aiMMAEBfflYT^
KILAEMWRAYDPCISCSVHWRL 




SEQ ID NO 119: M63654. Escherichia coli large subunit GI:1616956

MNVNSSSNRGEAILAALKTQFPGAVLDEERQTPEQVTTTVKINLLPDVVQYLYYQHD 
GWLPVLFGNDERTLNGHYAVYYALSMEGAEKCWIVVKALVDADSREFPSVTPRVP 

AAWGEREIRDNTi^GLIPVGLPDQRRLVLPDDWPEDMHPLRKDAMDYRLRPEPTTDS
ETYPFINEGNSDARVIPVGPLHrrSDEPGHFRLFVDGEQrVDADYRLFYV 
mGMEKLAETRMGYNEVm.SDRVCGICGFAHSVAYTNSVENALGIEWQRAHTIRS 
ILLEVERLHSHLLNLCa.SCHFVGFDTGFMQFFRVREKSMTMAELLIGSRKTYGLNLIG 
GVRRDIUCEQRLQTLKLVREMRADVSELVEMLLATPNMEQRTQGIGILDRQIARDLR 

FDHPYADYGNIPKTLFTFTGGDWSRVMWVKETFDSLAMLEFALDNMPDTPLLTEG
FSYKPHAFALGFVEAPRGEDVHWSMLGDNQKLFRWRCRAATYANWPVLRYMLRG 
NTVSDAPLnGSLDPCYSCTDRVTLVDVRKRQSKTVPYKEIERYGIDRNRSPLK 



SEQ ID NO 120: J02914. Methanothermobacter thermautotrophicus large subunit
GI:551889 

MSERIVISPTSRQEQiAELVMEVDDEGIVTKGRYFSITPVRGLEKIVTGKAPETAPVrV 
QRICGVCPIPHTLASVEAroDSLDIEWKAGRLLRELTLAAHHVNSHAlHHFLIAPDFV 

PENIMADAmSVSEIRKNAQYVVDMVAGEGIHPSDVMGGMADNITELARKRLYARL
KQLKPKVDEHVELMIGLIEDKGLPKGLGVHNQPTLASHQIYGDRTKFDLDRFTEVMP 
ESWYDDPEIAKRACSTIPLYDGRNVEVGPRARMVEFQGFKERGVVAQHVARALEM 
KTALARAIEILDELDTSAPVRADFDERGTGKLGVGAIEGPR(H.DVHMAQVENGKIQF 
YSALVPTIWffTMGPATEGFHHEYGPHVIRAYDPCLSCATHVNlVVDDEDRSVIRDE 

MVRL



SEQ ID NO 121: Y13764 Methanosarcina barkeri GI:2463274 large subunit
]SnKVVHSPTTimECHSKLTLKVNDEGIVERGDWLSlTPWGIEia.AIGKTNlDQVPKI
ASRVCGICPIAHTLAGffiAMEASIGCEIPKDAKLLRVILHAANRLHSHALHNE.ILPDFY 
ffDTETKINPFSKEQPLRSVAVRinUREIAQTIGAVAGGEAIHPSNPRVGGMYRNVSSR 
AKQKIADLAKE(a.VLAHEQMEFNnEVIR]^QDREFVEVAGKQIPLPKTLGYHNQGV 
MATAPMYGSSSLDEKPMWDFTRWRETRPWDWYMSEETIDLED

SSYPIGGTTKVGTKVNPRMEACNIYPTYDGQPVEVGPRARLATFKHFTEKGTFAQHI 
ARQMEYTDCYYTILNCLENLDTSGKVLADTIPLGNGSMGWAANEAPRGTDVHLAR 
W^GKVLRYEMLVPTTWNFPTCSRALTGAPWQIAEMVIRAYDPCVSCATHMIVVNE 
EDRIVAQKLMQW 


SEQ ID NO 122: U65510 Rhodospirillum rubrum large subunit GI:1498746

MSTYTIPVGPLHVALEEPMYFRIE\^GEKWSVDITAGHVHRGIEYLATKRNIYQNIV 
LTERVCSLCSNSHPQTYCMALESITGMVWPRAQYLRVIADETKRVASHMITSfVAILA
HIVGFDSLFMH\avlEAREIMQDTKEAVFGNRMDIAAMAIGGVKYDLDKDGRDYFIG 
QLDKLEPTLRDEIIPLYQTNPSIVDRTRGIGVLSAADCVDYGLMGPVA 
RGSGHAYDVRKQAPYAVYDRLDFEMALGEHGDVWSRAMVRWQEALTSIGLIRQCL 
RDMPDGPTKAGPVPPIPAGEAVAKTEAPRGELIYYLKTNGTDRPERLKWRVPTYMN 
WDALNVMMAGARISDIPLIVNSIDPCISCTER

SEQ ID NO 123: Chlamydomonas reinhardtii background with higher molecular weight 
variant amino acid (naturally occurring in D. vulgaris) (TIMEE->TIWEE) 

MALGLLAELRAGQAVACARRTNAPAHPAAWPCLPSRAGKFFNLSQKVPSSQSARG
SmVAATATDAVPHWKLALEELDKPKDGGRKVLIAQVAPAVRVAIAESFGLAPGA 
VSPGKLATGLRALGFDQVFDTLFAADLTTWEEGTELmRLKEHLEAHPHSDEPLPMF 
TSCCPGWAMMEKSYPELIPFVSSCKSPQMMMGAMVKTYLSEKQGIPAKDIVMVSV 
MPCVRKQGEADREWFCVSEPGVRDVDHVITTAELGNIFKERGINLPELPDSDWDQPL 

(aGSGAGVLFGTTGGVMEAALRTAYEIVrKEPLPRLNLSEVRGLDGIKEASVTLVPA
PGSKFAELVAERLAHKVEEAAAAEAAAAVEGAVKPPIAYDGGQGFSTDDGKGGLKL 
RVAVANGLGNAKKLIGKMVSGEAKYDFVEIMACPAGCVGGGGQPRSTDKQITQKR 
QAALYDLDERNTLRRSHENEAVNQLYKEFLGEPLSHRAHELLHTHYVPGGAEADA 
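
The TIMEE->TIWEE variant named in the heading of SEQ ID NO 123 is a single-residue substitution inside a five-residue motif. As a minimal sketch only, the following Python shows how such a motif swap could be applied to a protein string. The function name `substitute_motif` is hypothetical and not part of the listing, and the test fragment is a short stretch copied from SEQ ID NO 109 above, chosen only because it contains the TIMEE motif.

```python
def substitute_motif(protein: str, old: str, new: str) -> str:
    """Return `protein` with its single occurrence of `old` replaced by `new`.

    Hypothetical helper for illustration. It refuses to act when the motif
    is absent or occurs more than once, so the edit is never ambiguous.
    """
    if len(old) != len(new):
        raise ValueError("motifs must have equal length for a substitution")
    occurrences = protein.count(old)
    if occurrences != 1:
        raise ValueError(f"expected exactly one {old!r}, found {occurrences}")
    return protein.replace(old, new)


# Fragment taken from SEQ ID NO 109 (C. perfringens), used only as test data.
fragment = "DINFGADMTIMEEATELIE"
variant = substitute_motif(fragment, "TIMEE", "TIWEE")
print(variant)  # DINFGADMTIWEEATELIE
```

Requiring exactly one occurrence is a design choice of this sketch; a full-length sequence with a repeated motif would need an explicit position instead.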
SEQ ID NO 124: GAGVIFGATGGVMEAALRT

SEQ ID NO 125: GGGAIFCATGGVMEAAVRS

SEQ ID NO 126: GGATIFGVTGGVMEAALRF

SEQ ID NO 127: GAGAIFGATGGVMEAALRS

SEQ ID NO 128: GAGAIFGATGGVMEAAIRS

SEQ ID NO 129: GAAVIFGVTGGVMEAALRT

SEQ ID NO 130: GAGQIFAATGGVMEAASRT

SEQ ID NO 131: GAAWGTTGGVMEAALRT

SEQ ID NO 132: GAAPIFGVTGGVIEAALRT

SEQ ID NO 133: GAGYIFGITGGVMEAALRS

SEQ ID NO 134: GAGWGATGGVMEAAIRT

SEQ ID NO 135: SAC»«.FGVTGGVMEAAIRT

SEQ ID NO 136: GAGAIFGATGGVMEAALRT

SEQ ID NO 137: GAGVLFGTTGGVMEAALRT

SEQ ID NO 138: GAAALFGVTGGVMEAALRT

SEQ ID NO 139: GAGVLFGTTGGVMEAAVRT

SEQ ID NO 140: GAGTTFGTTGGVMEAALRT

SEQ ID NO 141: GGGVLFGTTGGVMEAALRT

SEQ ID NO 142: TIMEE

SEQ ID NO 143: TIVEE

SEQ ID NO 144: TIWEE

SEQ ID NO 145: TICEE

SEQ ID NO 146: VIMEE

SEQ ID NO 147: TARLE

SEQ ID NO 148: promoter of the lhcb1 gene; Genbank accession number Y16833:
gcagttgggtcaggggctggcgacgc^gctgacgcgcaagtgaatggcccaacaagtcgcctcgcggtcgctgtcggc 
gccaaacccgcagctgcatccaccagattcacttgttagatcgacctaggttgcgggaccggaggcggctcgctgtgcaagcgcggt 
gacctcgtacggcggcatggatcgccatctcgattcgcgcggcagaatcgggccccgcgcacatttaagccgcgggcgagactcat 
ttcgtta

SEQ ID NO 149: Selectable marker: (Lumbreras, Plant J (1998) 14(4): 441-447)

gccagaaggagcgcagccaaaccaggatgatgtttgatggggtatttgagcacttgcaacccttatccggaagccccctggcccaca 

aaggctaggcgccaatgcaagcagttcgcatgcagcccctggagcggtgccctcct^taaaccggccagggggcctatgttcttta 
cttttttacaagagaagtcactcaacatcttaaaatggccaggtgagtcgac^
mgacttgcaacgcccgcattgtgtcgacgaaggctmggctcctagtcgctgtctc^ 
mgcaggatggccaagctgaccagcgccgttccggtgctcaccgcgcgcgacgtcgccggagcggtcga 
ctcgggttctcccgggacttcgtggaggacgacttcgcxggtgtggtccgggacgacgtgaccctgttcato 
caggtgagtcgacgagcaagcccggcggatcaggcagcgtgcttgcagamgacttgcaacgcccgcattgtg^^
ttggctcctctgtcgctgtctcaagcagcatctaaccctgcgtcgccgmccamgcaggaccaggtgg^^ 
ctgggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcggaggtcgtgtccacgaacttccgggacg 
gccatgaccgagatcggcgagcagccgtgggggcgggagttcgccctgcgcgacccggccggcaaagcgtgcacttcgtggcc 
gaggagcaggactaaccgacgtcgacccactctagaggatcgatccccgctccgtgtaaatggaggcgctcgttg 
ccccctgacgaacggcggtggatggaagatactgctctcaagtgctgaagcggtagcttagctccccg^
caacacgtaaaaagcggaggagtmgcaaltttgttggttgtaacgatcctccgttgam 
tatttgaagcttaattaactcgagggggggcccggtacc 

SEQ ID NO 150: 3' untranslated region of RBCS2 gene

ccgacgtcgacccactctagaggatcgatccccgctccgtgtaaatggaggcgctcgttgatctgagccttgccccctgacg^ 

ggtggatggaagatactgctctcaagtgctgaagcggtagcttagctccccgtttcgtgctgatcagtctttttcaacacgtaaaaag^ 

gaggagttttgcaatmgttggttgtaacgatcctccgttgatmggcctctttctccatgggcgggrtgggcgtatttg 


SEQ ID NO 151: 3' untranslated region of RBCS2 gene - promoter of the lhcb1 gene
ccgacgtcgacccactctagaggatcgatccccgctccgtgtaaatggaggcgctcgttgatctgagccttgccc^ 
ggtggatggaagatactgctctcaagtgctgaagcggtagcttagctccccgtttcgtgctgatcagtct^ 
gaggagttttgcaatmgttggttgtaacgatcctccgltgattttggcxtctttctccatgggc^^^ 
ggggctggcgacgcgctgctgacgcgcaagtgaatggcccaacaagtcgcctcgcggtcgctgtcggcgccaaaccc^^
atccaccagattcacttgttagatcgacctaggttgcgggaccggaggcggctcgctgtgcaagcgcggtgac^^ 
tggatcgccatctcgattcgcgcggcagaatcgggcc(xgcgcacatttaagccgcgggcgagactcat^^ 

SEQ ID NO 152: Unique sequence a: 5' atccgtagttatccttatggccatcttagc 3'


SEQ ID NO 153: Unique sequence b: 5' cgtgcatcgattaacagcttctggacctga 3' 
SEQ ID NO 154: Unique sequence c: 5' ttaaacgtcgtacgtccaagtataactaag 3' 
SEQ ID NO 155: Unique sequence d: 5' aatctgatacatgctattcagatcttacaa 3'
SEQ ID NO 156: Unique sequence e: 5' tcttccatcgtaaatctagcatcgattagc 3'
SEQ ID NO 157: Unique sequence f: 5' atctgtaataatctagtcgaggcattcaag 3'
SEQ ID NO 158: Unique sequence g: 5' aactggcttaaatcgttaacaatcgtgtga 3'
SEQ ID NO 159: Unique sequence h: 5' gatttaacataactgtcgattaccgtgcga 3'


SEQ ID NO 160: Unique sequence i: 5' tatgcttgacaatcgtaatcctggtgacaa 3'

SEQ ID NO 161: Unique sequence j: 5' taacaagaatctggctaatcaatcgatgca 3'

SEQ ID NO 162: Unique sequence k: 5' gtagtcggaatagttactaacgaggattcg 3'

SEQ ID NO 163: Unique sequence l: 5' aaatgtctactcgactagtaaatcgtaact 3'

SEQ ID NO 164: Nonshuffled segment I:
gcagttgggtcaggggctggcgacgcgctgctgacgcgcaagtgaatggcccaacaagtcgcctcgcggtcgctgtcggc

gccaaacccgcagctgcatccaccagattcacttgttagatcgacctaggttgcgggaccggaggcggctcgctgtgcaagcgcggt 

gacctcgtacggcggcatggatcgccatctcgattcgcgcggcagaatcgggccccgcgcacatttaagccgcgggcgaga^^^ 

ttcgttaatccgtagttatccttatggccatcttagc 
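
As listed, nonshuffled segment I appears to end with unique sequence a (SEQ ID NO 152), and the later segments likewise appear to begin and end with unique sequences. A minimal Python sketch of a check for that pattern follows. The helper name `flanking_tags` and the dictionary layout are assumptions made for illustration, and only the last 37 nucleotides of segment I are used as input so that the example stays short.

```python
# Unique sequences a-c as given in SEQ ID NOs 152-154.
UNIQUE = {
    "a": "atccgtagttatccttatggccatcttagc",
    "b": "cgtgcatcgattaacagcttctggacctga",
    "c": "ttaaacgtcgtacgtccaagtataactaag",
}


def flanking_tags(segment: str, tags: dict) -> tuple:
    """Return (5' tag name, 3' tag name); None where no tag matches.

    Hypothetical helper: it uses exact string matching only, so a single
    mismatched base in a tag means the tag is not reported.
    """
    seq = segment.lower()
    five_prime = next((n for n, t in tags.items() if seq.startswith(t)), None)
    three_prime = next((n for n, t in tags.items() if seq.endswith(t)), None)
    return five_prime, three_prime


# Last 37 nucleotides of nonshuffled segment I as printed above.
segment_i_tail = "ttcgttaatccgtagttatccttatggccatcttagc"
print(flanking_tags(segment_i_tail, UNIQUE))  # (None, 'a')
```

Because the match is exact, the check is sensitive to transcription errors in either the segment or the tag, which is useful when proofreading a listing but too strict for noisy sequencing reads.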

SEQ ID NO 165: Nonshuffled segment II:

cgtgcatcgattaacagcttctggacctgaccgacgtcgacccactctagaggatcgatccccgctccgtgtaaatggaggcg^^ 
gatctgagccttgccccctgacgaacggcggtggatggaagatactgctctcaagtgctgaagcggtagcttagctcccc 
tgatcagtctttttcaacacgtaaaaagcggaggagttttgcaattttgttggttgtaacgatcctccgttga^ 
cgggctgggcgtatttggcagttgggtcaggggctggcgacgcgctgctgacgcgcaagtgaatggcccaacaagtcgcctcgcg 

gtcgctgtcggcgccaaacccgcagctgcatccaccagattcacttgttagatcgacctaggttgcgggaccggaggcggctcgctgt
gcaagcgcggtgacctcgtacggcggcatggatcgccatctcgattcgcgcggcagaatcgggccccgcgcacatttaagccgcg 
ggcgagactcatttcgttattaaacgtcgtacgtccaagtatgactaag 

SEQ ID NO 166: Nonshuffled segment III: 
aat(^gatacatgctancagatcttacaaccgacgtcgacccactctagaggatcgatc^

atctgagccttgccccctgacgaacggcggtggatggaagatactgctctcaagtgc^ 

gatcagtcttmcaacacgtaaaaagcggaggagtmgcaatmgttggttgtaacgatcctcc^ 

gggctgggcgtatttggcagttgggtcaggggctggcgacgcgctgctgacgcgcaagtgaatggcccaa^ 
cgctgtcggcgccaaacccgcagctgcatccaccagattcacttgttagatcgacctaggttgcgggaccggaggc^^
caagcgcggtgacctcgtacggcggcatggatcgccatctcgattcgcgcggcagaatcgggccccgcgcacatt^^ 
gcgatcttccatcgtaaatctagcatcgattagc 

SEQ ID NO 167: Nonshuffled segment IV: 
atctgtaataatctagtcgaggcattcaagccgacgtcgacccactctagaggatcgatccccgctrc
gatctgagccttgaxcctgacgaacggcggtggatggaagatactgctctcaagtgctgaagcgg^ 
tgatcagtcttmcaacacgtaaaaagcggaggagtmgcaattttgttggttgtaacgatcctcc^ 
cgggctgggcgtatttg 

SEQ ID NO 168: Nonshuffled segment V:

gccagaaggagcgcagccaaaccaggatgatgtttgatggggtatttgagcacttgcaacccttatccggaagccccctgg^ 
aaggctaggcgccaatgcaagcagttcgcatgcagcccctggagcggtgccctcctgataaaccggccagggggcctatg^tcttta 
cttttttacaagagaagtcactcaacatcttaaaatggccaggtgagtcgacgagcaagcccggcggatcaggcagcgtgcttgc^^ 
tttgacttg(mcgc(xgcattgtgtcgacgaaggcttttggctcctctgtcgctgtctcaagcagcatctaa^ 

tttgcaggatggccaagctgaccagcgccgttccggtgctcaccgcgcgcgacgtcgccggagcggtcgagttctggaccgaccg^
ctcgggttctcccgggacttcgtggaggacgacttcgccggtgtggtccgggacgacgtgaccctgttcatc^^ 
caggtgagtcgacgagcaagcccggcggatcaggcagcgtgcttgcagatttgacttgcaacgcccgcattgtg^^ 
ttggctcctctgtcgctgtctcaagcagcatctaaccctgcgtcgccgttt(xatttgcaggac^ 
ctgggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcggaggtcgtgtccacgaact^ 

gccatgaccgagatcggcgagcagccgtgggggcgggagttcgccctgcgcgacccggccggcaart
gaggagcaggactaaccgacgtcgacccactctagaggatcgatccccgctccgtgtaaatggaggcgc^^ 
ccccxtgacgaacggcggtggatggaagatactgctctcaagtgctgaagcggtagcttagctccccgttt 
caacacgtaaaaagcggaggagttttgcaattttgttggttgtaacgatcctccgttgattttggcctcm 
tatttgaagcttaattaactcgagggggggcccggtacc 


SEQ ID NO 169: Nonshuffled segment VI: 

gcagttgggtcaggggctggcgacgcgctgctgacgcgcaagtgaatggcccaacaagtcgcctcgcggtcgctgtcggc 
gccaaacccgcagctgcatccaccagattcacttgttagatcgacctaggttgcgggaccggaggcggctcgctgtgcaagcgcggt 
gacctcgtacggcggcatggatcgc(»tctcgattcgcgcggcagaatcgggccccgcgcaM
ttcgttaaactggcttaaatcgttaacaatcgtgtga 

SEQ ID NO 170: Nonshuffled segment VII: 
gatttaacataactgtcgattaccgtgcgaccgacgtcgacccactctagaggatcgatccccgctccgt^
gatctgagccttgccccctgacgaacggcggtggatggaagatactgctctcaagtgctgaagcggtagc^ 
tgatcagtcttmcaacacgtaaaaagcggaggagtmgcaatmgttggttgtaacgatcctccg^ 
cgggctgggcgtatttggcagttgggtcaggggctggcgacgcgctgctgacgcgcaagtgaatggcccaa^ 
gtcgctgtcggcgccaaacccgcagctgcatccaccagattcacttgttagatcgacctaggttgcggga^ 
gcaagcgcggtgacctcgtacggcggcatggatcgccatctcgattcgcgcggcagaatcg^
ggcgatatgcttgacaatcgtaatcctggtgacaa 

SEQ ID NO 171: Nonshuffled segment VIII: 

taacaagaatctggctaatcaatcgatgcaccgacgtcgacccactctagaggatcgatccccgctcc^^ 
gatrtgagccttgccccctgacgaacggcggtggatggaagatactgrtctcaagtgctgaagcggtagctt
tgatcagtctttttcaacacgtaaaaagcggaggagttttgcaattttgttggttgtaacgatcctcc^ 
cgggctgggcgtatttg 

SEQ ID NO 172: Chlamydomonas reinhardtii ferredoxin; Genbank accession number 
L10349

atggccatggctatgcgctccaccttcgccgcccgcgttggcgctaagcccgctgtccgcggtgctcgccccgccag 
tgcatggcctacaaggtcaccctgaagaccccttcgggcgacaagaccattgagtgccccgctgacacctacatcct^^ 
gaggaggccggcctggacctgccctactcttgccgcgctggtgcttgct(x:agctgcgccggcaag^^ 
accagtcggaccagtcdtcctggacgatgcccagatgggcaacggcttcgtgctgacctgcgtgg^ 
ccatccagacccaccaggaggaggccctgtactaa

SEQ ID NO 173: Chlamydomonas reinhardtii hydrogenase; Genbank accession number 
AF289201 

atgtcggcgctcgtgctgaagccctgcgcggccgtgtctattcgcggcagctcctgcagggcgcggcaggtcgccccccgcgctcc 
gctcgcagccagcaccgtgcgtgtagcccttgcaacacttgaggcgcccgcacgccgcctaggcaacgtcgcttgcgcggctgccg
cacccgctgcggaggcgcctttgagtcatgtccagcaggcgctcgccgagcttgccaagcccaaggacgaccccacgcgcaagca 
cgtctgcgtgcaggtggctccggccgttcgtgtcgctattgccgagaccctgggcctggcgccgggcgccaccacccccaagcagc 
tggccgagggcctccgccgcctcggctttgacgaggtgtttgacacgctgtttggcgccgacctgaccatcatggaggagggcagcg 
agctgctgcaccgcctcaccgagcacctggaggcccacccgcactccgacgagccgctgcccatgttcaccagctgctgccccggc 
tggatcgctatgctggagaaatcttacccggacctgatcccctacgtg^
gtcaagtcctacctagcggaaaagaagggcatcgcgccaaaggacatggtcatggtgtcc^^ 
ggaggctgaccgcgactggttctgtgtggacgccgaccccaccctgcgccagctggaccacg^^ 
acatcttcaaggagcgcggcatcaacctggccgagctgcccgagggcgagtgggacaatccaatgggcgt^ 

gcgtgctgttcggcaccaccggcggtgtcatggaggcggcgctgcgcacggcctatgagctgttcacgggcacgccgc^^^
cctgagcctgagcgaggtgcgcggcatggacggcatcaaggagaccaacatcaccatggtgcccgcgcccgggtccaagmg 
gagctgdgaagcaccgcgccgccgcgcgcgccgaggccgccgcgcacggcacccccgggccgctggcctgggacggcggc 
gcgggcttcaccagcgaggacggcaggggcggcatcacactgcgcgtggccgtggccaacgggctgggcaacgccaagaagrt 
gatcaccaagatgcaggccggcgaggccaagtacgactttgtggagatcatggcctgccccgcgggagtgtgggcggcg 

cagccccgctccaccgacaaggccatcacgcagaagcggcaggcggcgctgtacaacctggacg^

agccacgagaacccgtccatccgcgagctgtacgacacgtacctcggagagccgctgggccacaaggcg^^ 
acccactacgtggccggcggcgtggaggagaaggacgagaagaagtga 

SEQ ID NO 174: Clostridium pasteurianum hydrogenase; Genbank accession number
M81737

atgaaaacaataattataaatggtgtacagmaatactgatgaagacactactatattaaaam^ 
gcactgtgtttmaaataattgtaataatgacalaaataagtgtgaaatatgtactgtagaggtagagggtact^ 
gatacattaattgaggatggtatgattataaacacaaattccgatgctgtcaacgaaaaaattaaatctagaatatctcaatt 
catgaattcaaatgtggtccttgcaatagaagagaaaactgtgaattcttaaaacttgttataaaatataaagcaagagcttct^ 

ttacctaaagataagactgaatatgtagatgaaagaag^aaatcattaactgtagataggacaaaatgcttattatgtggaagat
tgcctgtggaaaaaatactgaaacctatgcaatgaaatttttaaacaaaaatggtaaaactataattggagcagaggatgaa^ 
gatgatactaattgtctattatgtggtcaatgtataatcgcctgtccagtagcagcaltatcggaaaaalcacaca^ 
tgccttaaatgcccctgaaaaacatgtaatagtagctatggctccatctgtcagagcttctataggtgaacttt^ 
gacgtaacaggaaaaatttatactgctttaagacagcttggamgataaaataltcgatataaacttcggagc^^ 

agaggctacagaattagttcaaagaatagagaataatggacctUcccaatgtttacatcttgctgcccag^
aaaftattatcctgaattactaaataalctttcatcagctaaatcacctcaacaaatttftgg 
gtcttgacccaaagaatgtamactgtaacagttatgccctgtacncaaaaaaamgaagcagatagaccaca^ 
gcctaagagatatagatgctgttataactactcgagaattagcaaaaatgattaaagatgctaaaataccamgct^ 
gaagcagaccctgctatgggagaatacagcggtgctggtgccatamggtgcaactggcggagttatggaagcagcttta^ 

caaaagactttgctgaaaacgctgaacttgaagatatagaatataagcaagttagaggattaaatggtataaaagaagctgaagtagM
ataaataacaacaaatataatgtagctgttataaatggtgcttcaaatttatttaagmatgaaatctggtatgattaacg 
mcatagaagtaatggcttgtcatggaggatgtgtaaatggtggtggacagcctcatgtaaacccaaaagatttagaaaaagtagacat 
aaaaaaagtaagagcttctgtattgtataatcaggatgaacatcmccaagagaaaatctcatgaaaatactgcattagttaaaatgtat 
aaattattttggcaaaccaggtgaaggtcgtgcccatgaaatattacactttaaatataaaaaataa 
SEQ ID NO 175: Desulfovibrio vulgaris hydrogenase; Genbank accession number X02416
atgagccgtaccgtcatggagcgcatcgaatatgagatgcacactccggaccccaaggccgatccggac^ 
atcgacgaggcaaagtgcataggctgcgacacctgttcgcagtactgcccca(Xgccgccat<mcggcgaaatgg^^^ 
ctccattccccacatcgaggcgtgcatcaactgcggccagtgcctcacgcactgccccgagaacgccatctacgaggcacagtc^
gtgcctgaagtcgagaagaagctgaaggacggcaaggtgaaatgcatcgccatgcccgcccccgccgtgcgctat^ 
acgccttcggcatgcccgtcggttccgtcaccaccggcaagatgctcgcggccctgcagaagctcggcttcgdc^^ 
ccgagttcaccgctgacgtgaccatctgggaagaggggtccgagttcgtggaacgcctcaccaagaagagcgacatgcc^ 
cagttcacctcgtgctgccccggctggcagaagtatgccgagacctactaccccgaactgctgccg^^ 

ccatcggcatgaacggcgcactggcgaagacctacggcgcagagcggatgaagtacgacccc^^
catgccctgcatcgcaaagaagtacgaagggttgcgtcccgaartgaagtccagcggcatgcgcgaca^^^ 
cccgtgagctggcctacatgatcaagaaggccggtatcgacncgcgaaactccccgacggcaa^ 
ccaccggcggtgccaccatcttcggcgtcaccggcggcgtcatggaagcggcactccgcttcgcct^ 
gaagccxgacagctgggacttcaaggccgtgcgcggtcttgatggcatcaaggaagccaccgtcaac^ 

aggtcgccgtggtgcacggggccaagcggttcaagcaggtctgcgacgatgtgaaggcgggcaagtcgcc^
tacatggcctgccccggcggctgcgtrtgtggcggcggtcagcccgtcatgcccggcgtgctcgaagc^^ 
ccgcctttacgcgggcctgaagaagcgcctcgccatggcgagcgccaacaaggcatag 

SEQ ID NO 176: Entamoeba histolytica hydrogenase; Genbank accession number 
AF248542

atgccacctaaaccatcacatacactcaccggacatgaccataaccatagtattcaatttgattggtctaaatgcatgggttgtggaatgt 
gtgctactaaatgtacttttggggtgttagtaaaacaaccaccaaaaattccaccatttgttcagcctaatagagaaaaactcta^ 
aataccgacaagacaagagtacttattgatgagtctgaatgtactgggtgtggtcaatgttctttggtttgtaacm 
tagaccatcttgttgatacttttaaagctaaagaagctggaaagaagcttgttgctatgattgcaccttcaact 

ctatgggaatgcctattggaagtacagrtatggctcagttagttcattgtttaagacttattggatttgattatg^

gctgataagacaacaatggatgattatgccgaagttattgaaatgaaaaaagaaggaaaaggacctgctattacttcct 
ggattgaacttgttgaaaaagaatatcctgacttaattccaaacgtctctactgcccgttcaccaattggatgtttagct 
aggatgggcaaaggatgtaggaattgcagtagaagatctttacactgttggaataatgccUgtaltgc^^ 
acaacaaattcatcaagactatgatgcttcatgtacttcaaatgaaattgctgcttatttcaaaaaacatcttccacctgaagaat 

acacaagaaagagaagaagcacttgctaaaactgaagatggtcaatgtgatttaccatttagacgtatttctggtggttctaatat^
aagactggaggagtttgtgaaactgtattgagagtaattgcacgtaatgcaggagttgattggaacagttgtactgttaac^^ 
acttttaaacatgctgcaagtggatcaacaatgacaaatctttctgttgatattggtggaactattatcacaggtgctgttt^^ 
atgctattagacatgcttgtgaacttattagaaaaggagagttaaaagttgatgttgttgaaatgatggcatgtgttggaggttgtm 
gagcaggtcaaccaaaaattccaccagcaaagaaacttgagatggataagagaagagtaatgttagatattttagatcaacaaactga^ 






Attorney Docket Number: H2042101-CIP
Inventor: Harrison F. Dillon 



attagagctgctaatgaaaatactgatgttcttggatggattgataaacatmgatcatcaaggtg 
ctcccagatatcaaaactaa 

SEQ ID NO 177: Scenedesmus obliquus hydrogenase; Genbank accession number
AJ271546

atgcctgagtggcaaccgggaggtcggtatgctgmctgtccgcccg(xagtgaacaggcgggctgtggtggcagcagagcgcag 
gcgccttgttgtgcgggcagctggcccaacagcagaatgtgattgcccaccagctcccgcgcccaaggccccgcactggcagcag 
acgctagatgagctagccaagcctaaggagcagcgcaaggtgatgatcgcccagatcgcaccagcagtgcgcgtggctattgcaga 
gaccatgggactcaaccctggggatgtgacagttggccagatggtgaccggcctgcgcatgctgggcmgattatgtgtttgacacgc 

tgtttggtgctgacctcaccatcatggaggagggcacagagctacggcacaggcttcaggaccacctggagcagcaccccaacaag
gaggagccgctgcccatgttcaccagctgctgccctggctgggtggccatggtggagaagtccaaccccgagctcatcw 
tcttcctgcaagtcgccccagatgatgctgggcgcagtcatcaagaactacttcgctgccgaggcc^^ 
gcaacgtgagcgtgatgccctgcgtgcgcaagcagggcgaggctgaccgcgagtggttcaacaccacaggggctggc^^ 
acgtggaccacgtcatgacaactgcagagctgggcaagatctttgtggagcgcggaatcaagctgaacgacctgcag^^ 

tttgacaaccccgtcggcgagggcagcggcggcggcgtgctgttcggcaccactggaggcgtgatggaggcggcgct^^
gtgtacgaagtggtcacacagaagccWggaccgcatcgtctttgaggacgtgcgcggcrtggagggca^^ 
cacctcaccccaggccccaccagccccttcaaggccmgcaggcgcagacggcaccggcatcaccctcaacatcgcg^ 
cggcctcggcaatgccaagaagctcatcaagcagctggctgcaggcgagagcaagtacgacttcatcgaggtcatggcctgccccg 
gcggctgcatcggcggcggcggccagccgcgcagcgcggacaagcagatcctgcagaagcgccaggcggccatgtacgacctg 

gacgagcgcgcggtgatccggcgcagccacgagaacccgctgattggcgcgctgtatgagaagttcctgggcgagcccaacggc
cacaaggcgcacgagctgctgcacacgcactacgtggccggcggcgtgcccgatgagaagtga 

SEQ ID NO 178: Chlorella fusca hydrogenase; Genbank accession number AJ298227
atgtgttgccccgtggttgcaagtaggcacgcagggcgtgcaaggcatgttgctgtccgtgcagcagggccaaca 

gtcctccaacacctcaggccaagctgcctcactggcagcaggctctggatgagctcgccaagcccaaggagagcaggaggU
atcgcgcaaatcgcctccgctgttcgtgtcgctattgctgagaccattggcttggccccaggagatgtcacM^ 
gggctgcgtatgcttggctttgattatgtcmgacaccctgmggtgctgacctgaccattalgg^^ 
gcctgcaggaccatctggagcagcaccccaacaaggaggagccactgcccatgttcaccagttgctgcccaggctgg^^ 
gUgaaaagagcaatcctgagctcatcccctacctgtcatcftgcaagtcgcctcagatgatgcttggggccgttatcaaga^ 

cacagcaggttggagtgcagcccagtgacatctgcaacgtgtcagtcatgccatgcgtacgcaagcagggagaggctgacc^^
gtggttcaacaccacaggtgcaggccttgcccgtgatgttgatcatgtggtgactactgctgaggttggtaagatattcctggagcgtgg 
catcaagctgaatgagctgccagagagcaactttgacaaccccattggcgagggcacaggtggtgctctgctgtttggcaccactgga 
ggtgtcatggaggcagcacttcgcacagtctatgaagtggtgacccagaagcccatgggtcgtgttgactttgaggaggtgcgaggc 
cttgaaggaatcaaggaggcagagatcacactcaagccaggagacgacagcccattcaaagccttcgcaggagctgatgggcagg 
gcatcacgctcaagattgcagtagccaatgggcttggcaatgccaagaagctcatcaagagcctgtc^^
amcattgaggtcatggcatgccctggtggctgcattggcggaggcggtcagccccgc^ 
gccagcaggctatgtacaacctggatgagcgcagtaccatccgccgcagccatgata^ 
cctaggcgcacccaacagccacaaggcacatgatctgctgcacacacactatgtggcag^^ 


SEQ ID NO 179: Synthetic construct green fluorescent protein mRNA; Genbank accession 
number AF188479

atggccaagggcgaggagctgttcaccggtgtggtccccatcctggtggagctggacggcgacgtgaacggccacaagttctcc 
ctccggcgagggtgagggtgacgccacctacggcaagctgaccctgaagttcatctgcaccaccggcaagctgcccgtgccctggc 

ccaccctggtcaccaccctgacctacggtgtgcagtgcttctcccgctaccccgaccacatgaagcagcacgacttcttcM
catgcccgagggctacgtgcaggagcgcaccatcttcttcaaggacgacggcaactacaagacccgcgccgaggtcaagt^^ 
ggcgacaccctggtgaaccgcatcgagctgaagggcatcgacttcaaggaggacggcaacatcctgggccacaagctggagtaca 
actacaactcccacaacgtgtacatcatggccgacaagcagaagaacggcatcaaggtgaacttcaagalc^^^ 
gacggctccgtgcagctggccgaccactaccagcagaacacccccatcggcgatggccccgtgctgctgcccgacaaccactacct 

gtccatccagtccgccctgtccaaggaccccaacgagaagcgcgaccacatggtcctgctggagttcgtcaccgctgcc^
ccacggcatggacgagctgtacaagtaa 

SEQ ID NO 180: Nonshuffled segment IX:

atccgtagttatccttatggccatcttagcgcagttgggtcaggggctggcgacgcgctgctgacgcgcaagtgaatggcccaacaag 
tcgcctcgcggtcgctgtcggcgccaaacccgcagctgcatccaccagattcacttgttagatcgacctaggttgcgggaccggagg
cggctcgctgtgcaagcgcggtgacctcgtacggcggcatggatcgccatctcgattcgcgcggcagaatcgggccccgcgcacat 
ttaagccgcgggcgagactcatttcgttacgtgcatcgattaacagcttctggacctga 

SEQ ID NO 181: Nonshuffled segment X:

ttaaacgtcgtacgtccaagtataactaagccgacgtcgacccactctagaggatcgatccccgctccgtgt^^

gatctgagccttgccccctgacgaacggcggjggatggaagatactgctctcaagtgctgaagcggtagcttagctccccg^ 
tgatcagtctttttcaacacgtaaaaagcggaggagttttgcaattttgttggttgtaacgatcctccgttgatttt 
cgggctgggcgtatttggcagttgggtcaggggctggcgacgcgctgctgacgcgcaagtgaatggcccaacaagt 
gtcgctgtcggcgccaaacccgcagctgcatccaccagattcacttgttagatcgacctaggttgcgggaccggaggcgg 

gcaagcgcggtgacctcgtacggcggcatggatcgccatctcgattcgcgcggcagaatcgggccccgcgcacatttaagccgcg
ggcgagactcatttcgttaaatctgatacatgctattcagatcttacaa 

SEQ ID NO 182: Nonshuffled segment XI:
tcttccatcgtaaatctagcatcgattag(xcgacgtcgacccactctagaggatcga^

atctgagccttgccccrtgacgaacggcggtggatggaagatactgctctcaagt^^ 

gatcagtcmmcaacacgtaaaaagcggaggagtWgcaatmgttggttgtaacgatcctc^ 

gggctgggcgtamggcagttggg^caggggctggcgacgcgctgctgacgcgcaagtg^^ 

cgctgtcggcgccaaacccgcagctgcalccaccagattcacttgttagatcgacctaggttgcgggaccggaggc^ 

caagcgcggtgacctcgtacggcggcatggatcgccatctcgattcgcgcggcagaatcgggccccgcgcacatt^ 

gcgagactcatttcgttaatctgtaataatctagtcgaggcattcaag 

SEQ ID NO 183: Nonshuffled segment XII:

atctgtaataatctagtcgaggcattcaagatggccaagggcgaggagctgttcaccggtgt^ 

gcgacgtgaacggccacaagttctccgtctccggcgagggtgagggtgacgccacctacggc^^ 

accaccggcaagctgcccgtgccctggcccaccctggtcaccaccctgacctacggtgt^^ 

atgaagcagcacgacttcttcaagtccgccatgcccgagggctacgtgcaggagcgcacc^^^ 

aagacccgcgccgaggtcaagttcgagggcgacaccctggtgaaccgcatcgagctgaagggcat 

aacatcctgggccacaagctggagtacaactacaactcccacaacgtgtacatcatggccg 

gaacttcaagatccgcxacaacatcgaggacggctccgtgcagctggccgac^actaccagc^ 

cccgtgctgctgcccgacaaccactacctgtccatccagtccgccctgtccaaggaccccaacgagM^ 

ctggagttcgtcaccgctgccggcatcacccacggcatggacgagctgtacaagtaaaactggcttaaatcgttaac^ 

SEQ ID NO 184: Nonshuffled segment XIII: 

aactggcttaaatcgttaacaatcgtgtgaccgacgtcgacccactctagaggatcgatccccgctccg^ 
gatctgagccttgccccctgacgaacggcggtggatggaagatactgctc^caagtgctgaagcggtagcttagct^ 
tgatcagtctttttcaacacgtaaaaagcggaggagtmgcaattttgttggttgtaacgatcctc^ 
cgggctgggcgtatttggatttaacataactgtcgattaccgtgcga 